dict(type='TensorboardLoggerHook')])  # The logger used to record the training process.
-total_epochs = 36  # Total epochs to train the model
+runner = dict(type='EpochBasedRunner', max_epochs=36)  # Runner that runs the workflow for max_epochs epochs in total
dist_params = dict(backend='nccl')  # Parameters to set up distributed training; the port can also be set.
log_level = 'INFO'  # The level of logging.
find_unused_parameters = True  # Whether to find unused parameters
work_dir = None  # Directory to save the model checkpoints and logs for the current experiment.
load_from = None  # Load a pre-trained model from the given path. This does not resume training.
resume_from = None  # Resume from a checkpoint at the given path; training resumes from the epoch at which the checkpoint was saved.
-workflow = [('train', 1)]  # Workflow for the runner. [('train', 1)] means there is only one workflow, and the workflow named 'train' is executed once. The workflow trains the model for 36 epochs according to total_epochs.
+workflow = [('train', 1)]  # Workflow for the runner. [('train', 1)] means there is only one workflow, and the workflow named 'train' is executed once. The workflow trains the model for 36 epochs according to max_epochs.
@@ -179,7 +179,7 @@ so that 1 epoch for training and 1 epoch for validation will be run iteratively.
**Note**:
1. The model's parameters are not updated during a val epoch.
-2. The keyword `total_epochs` in the config controls only the number of training epochs and does not affect the validation workflow.
+2. The keyword `max_epochs` in `runner` in the config controls only the number of training epochs and does not affect the validation workflow.
3. Workflows `[('train', 1), ('val', 1)]` and `[('train', 1)]` do not change the behavior of `EvalHook`, because `EvalHook` is called by `after_train_epoch`, while the validation workflow only affects hooks that are called through `after_val_epoch`. Therefore, the only difference between `[('train', 1), ('val', 1)]` and `[('train', 1)]` is that the runner calculates losses on the validation set after each training epoch.
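
The three notes above can be illustrated with a small toy model of an epoch-based runner loop. This is only a sketch and not MMCV's actual implementation: `run_workflow` and the event strings are hypothetical names made up for this example, and it assumes that `EvalHook` fires after every training epoch (a real `EvalHook` may be configured with a longer interval).

```python
def run_workflow(workflow, max_epochs):
    """Toy model of an epoch-based runner loop (hypothetical, not MMCV code).

    Returns the list of events in the order they would happen.
    """
    events = []
    train_epochs = 0  # only training epochs count towards max_epochs
    while train_epochs < max_epochs:
        for mode, epochs in workflow:
            for _ in range(epochs):
                if mode == 'train':
                    if train_epochs >= max_epochs:
                        break
                    train_epochs += 1
                    events.append(f'train epoch {train_epochs}')
                    # Assumption: EvalHook is triggered from after_train_epoch.
                    events.append('after_train_epoch: EvalHook')
                else:
                    # A val epoch only computes losses; no parameter update,
                    # and the training epoch counter is left untouched.
                    events.append('val epoch: losses only')
    return events


train_only = run_workflow([('train', 1)], max_epochs=2)
train_val = run_workflow([('train', 1), ('val', 1)], max_epochs=2)

for event in train_val:
    print(event)
```

In this sketch both workflows run the same number of training epochs and trigger `EvalHook` the same number of times; the `[('train', 1), ('val', 1)]` workflow only adds one loss-only val epoch after each training epoch.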