Commit b90b0570 authored by Michael Carilli

Minor typos

parent a3dbea38
@@ -30,7 +30,7 @@ CPU data loading bottlenecks.
 `O0` and `O3` can be told to use loss scaling via manual overrides, but using loss scaling with `O0`
 (pure FP32 training) does not really make sense, and will trigger a warning.

-Softlink training and validation dataset into current directory
+Softlink training and validation dataset into current directory:
 ```
 $ ln -sf /data/imagenet/train-jpeg/ train
 $ ln -sf /data/imagenet/val-jpeg/ val
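As a rough illustration of what the `--loss-scale` override mentioned above does, here is a minimal sketch, assuming the script forwards the flag to `amp.initialize` the way the rest of this example does (the tiny `Linear` model is a hypothetical stand-in):

```python
import torch
from apex import amp

model = torch.nn.Linear(10, 10).cuda()  # hypothetical stand-in model
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

# Manual override: pin the loss scale at 128.0 instead of the opt level's
# default. Combining a loss scale with O0 (pure FP32) is allowed but
# pointless, which is why amp emits a warning in that case.
model, optimizer = amp.initialize(model, optimizer,
                                  opt_level="O1", loss_scale=128.0)

loss = model(torch.randn(4, 10).cuda()).sum()
with amp.scale_loss(loss, optimizer) as scaled_loss:
    scaled_loss.backward()  # backward runs on loss * 128.0
optimizer.step()
```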
@@ -42,7 +42,7 @@ Amp enables easy experimentation with various pure and mixed precision options.
 ```
 $ python main_amp.py -a resnet50 --b 128 --workers 4 --opt-level O0 ./
 $ python main_amp.py -a resnet50 --b 224 --workers 4 --opt-level O3 ./
-$ python main_amp.py -a resnet50 --b 224 --workers 4 --opt-level O3 --keep-batchnorm-FP32 True ./
+$ python main_amp.py -a resnet50 --b 224 --workers 4 --opt-level O3 --keep-batchnorm-fp32 True ./
 $ python main_amp.py -a resnet50 --b 224 --workers 4 --opt-level O1 ./
 $ python main_amp.py -a resnet50 --b 224 --workers 4 --opt-level O1 --loss-scale 128.0 ./
 $ python -m torch.distributed.launch --nproc_per_node=2 main_amp.py -a resnet50 --b 224 --workers 4 --opt-level O1 ./
@@ -64,7 +64,7 @@ $ python main_amp.py -a resnet50 --b 224 --workers 4 --opt-level O3 ./
 ```
 FP16 training with FP32 batchnorm:
 ```
-$ python main_amp.py -a resnet50 --b 224 --workers 4 --opt-level O3 --keep-batchnorm-FP32 True ./
+$ python main_amp.py -a resnet50 --b 224 --workers 4 --opt-level O3 --keep-batchnorm-fp32 True ./
 ```
 Keeping the batchnorms in FP32 improves stability and allows Pytorch
 to use cudnn batchnorms, which significantly increases speed in Resnet50.
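A minimal sketch of what `keep_batchnorm_fp32` leaves behind, assuming apex's `amp.initialize` and using torchvision's resnet50 as a stand-in: after initialization the conv weights are FP16 while the batchnorm parameters stay FP32, which is the combination cuDNN's batchnorm kernels accept.

```python
import torch
import torchvision.models as models
from apex import amp

model = models.resnet50().cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

# "Pure FP16" except batchnorm, mirroring
# --opt-level O3 --keep-batchnorm-fp32 True on the command line.
model, optimizer = amp.initialize(model, optimizer,
                                  opt_level="O3", keep_batchnorm_fp32=True)

for name, module in model.named_modules():
    if isinstance(module, torch.nn.BatchNorm2d):
        assert module.weight.dtype == torch.float32  # batchnorm stays FP32
    elif isinstance(module, torch.nn.Conv2d):
        assert module.weight.dtype == torch.float16  # everything else is FP16
```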
@@ -72,8 +72,8 @@ to use cudnn batchnorms, which significantly increases speed in Resnet50.
 The `O3` options might not converge, because they are not true mixed precision.
 However, they can be useful to establish "speed of light" performance for
 your model, which provides a baseline for comparison with `O1` and `O2`.
-For Resnet50 in particular, `--opt-level O3 --keep-batchnorm-FP32 True` establishes
-the "speed of light."  (Without `--keep-batchnorm-FP32`, it's slower, because it does
+For Resnet50 in particular, `--opt-level O3 --keep-batchnorm-fp32 True` establishes
+the "speed of light."  (Without `--keep-batchnorm-fp32`, it's slower, because it does
 not use cudnn batchnorm.)

 #### `--opt-level O1` ("conservative mixed precision")
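To make the baseline comparison concrete, here is a rough standalone timing harness. This is a sketch only: the script name and CLI are hypothetical, synthetic data replaces the real dataloader, and each measurement runs in its own process because `amp.initialize` is meant to be called once per process.

```python
import sys
import time
import torch
import torchvision.models as models
from apex import amp

# Hypothetical usage, one process per measurement:
#   $ python sol.py O3 True   # "speed of light" (FP16 + FP32 batchnorm)
#   $ python sol.py O1        # mixed precision to compare against
opt_level = sys.argv[1]
keep_bn = (sys.argv[2] == "True") if len(sys.argv) > 2 else None

model = models.resnet50().cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
model, optimizer = amp.initialize(model, optimizer, opt_level=opt_level,
                                  keep_batchnorm_fp32=keep_bn)
criterion = torch.nn.CrossEntropyLoss().cuda()

batch, iters = 224, 50
data = torch.randn(batch, 3, 224, 224).cuda()      # synthetic images
target = torch.randint(0, 1000, (batch,)).cuda()   # synthetic labels

for i in range(iters + 5):          # first 5 iterations are warmup
    if i == 5:
        torch.cuda.synchronize()
        start = time.time()
    optimizer.zero_grad()
    loss = criterion(model(data), target)
    with amp.scale_loss(loss, optimizer) as scaled_loss:
        scaled_loss.backward()
    optimizer.step()

torch.cuda.synchronize()
print("{}: {:.1f} images/sec".format(opt_level,
                                     iters * batch / (time.time() - start)))
```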
@@ -95,15 +95,10 @@ def fast_collate(batch):
 best_prec1 = 0
 args = parser.parse_args()

-# Let multi_tensor_applier be the canary in the coalmine
-# that verifies if the backend is what we think it is
-assert multi_tensor_applier.available == args.has_ext
-
 print("opt_level = {}".format(args.opt_level))
 print("keep_batchnorm_fp32 = {}".format(args.keep_batchnorm_fp32), type(args.keep_batchnorm_fp32))
 print("loss_scale = {}".format(args.loss_scale), type(args.loss_scale))

 print("\nCUDNN VERSION: {}\n".format(torch.backends.cudnn.version()))

 if args.deterministic:
@@ -342,8 +337,8 @@ def train(train_loader, model, criterion, optimizer, epoch):
         input, target = prefetcher.next()

         if i%args.print_freq == 0:
-            # Every print_freq iterations, let's check the accuracy and speed.
-            # For best performance, it doesn't make sense to collect these metrics every
+            # Every print_freq iterations, check the loss, accuracy and speed.
+            # For best performance, it doesn't make sense to print these metrics every
             # iteration, since they incur an allreduce and some host<->device syncs.

             # Measure accuracy
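The allreduce the comment refers to is the cross-process averaging of metrics. A minimal sketch of the pattern (the `reduce_tensor` helper here is illustrative, assuming `torch.distributed` has been initialized as in the launcher commands above):

```python
import torch
import torch.distributed as dist

def reduce_tensor(tensor, world_size):
    # Average a metric across all ranks. This blocks until every rank
    # participates, which is why it is done only every print_freq
    # iterations rather than every step.
    rt = tensor.clone()
    dist.all_reduce(rt, op=dist.ReduceOp.SUM)
    rt /= world_size
    return rt

# Inside the training loop, gated on the print frequency:
#     if i % args.print_freq == 0:
#         reduced_loss = reduce_tensor(loss.data, args.world_size)
#         losses.update(reduced_loss.item(), input.size(0))  # .item() is a host<->device sync
```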
@@ -374,8 +369,8 @@ def train(train_loader, model, criterion, optimizer, epoch):
                   'Prec@1 {top1.val:.3f} ({top1.avg:.3f})\t'
                   'Prec@5 {top5.val:.3f} ({top5.avg:.3f})'.format(
                    epoch, i, len(train_loader),
-                   args.print_freq*args.world_size*args.batch_size/batch_time.val,
-                   args.print_freq*args.world_size*args.batch_size/batch_time.avg,
+                   args.world_size*args.batch_size/batch_time.val,
+                   args.world_size*args.batch_size/batch_time.avg,
                    batch_time=batch_time,
                    loss=losses, top1=top1, top5=top5))
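This hunk also corrects the reported throughput: `batch_time` measures a single iteration, so images/sec is simply `world_size * batch_size / batch_time`, without the old `print_freq` factor. A quick sanity check (the numbers are made up):

```python
world_size, batch_size = 2, 224   # 2 GPUs, 224 images each per iteration
batch_time = 0.25                 # seconds for one iteration
print(world_size * batch_size / batch_time)  # 1792.0 images/sec total
```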