To train a model, create softlinks to the Imagenet dataset, then run `main_amp.py` with the desired model architecture, as shown in `Example commands` below.
The default learning rate schedule is set for ResNet50. The `main_amp.py` script rescales the learning rate according to the global batch size (number of distributed processes \* per-process minibatch size).
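As a rough illustration of that linear rescaling rule, here is a minimal sketch; the base learning rate of 0.1, the reference global batch size of 256, and the variable names are assumptions for illustration, not values taken from `main_amp.py` itself:

```python
# Sketch of linear learning-rate scaling by global batch size.
base_lr = 0.1          # assumed LR tuned for a reference global batch size
reference_batch = 256  # assumed global batch size the base LR was tuned for

world_size = 8         # number of distributed processes (e.g. 8 GPUs)
per_gpu_batch = 64     # per-process minibatch size

global_batch = world_size * per_gpu_batch              # 8 * 64 = 512
scaled_lr = base_lr * global_batch / reference_batch   # 0.1 * 512 / 256 = 0.2

print(f"global batch {global_batch}, scaled LR {scaled_lr}")
```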
## Example commands
...
...
CPU data loading bottlenecks.
**Note:** `--opt-level` `O1` and `O2` both use dynamic loss scaling by default unless manually overridden.
`--opt-level` `O0` and `O3` (the "pure" training modes) do not use loss scaling by default.
`O0` and `O3` can be told to use loss scaling via manual overrides, but using loss scaling with `O0`
(pure FP32 training) does not really make sense, and will trigger a warning.
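For reference, a minimal sketch of how a loss-scaling override can be passed through the Amp API; the tiny model, the optimizer settings, and the static scale of 128.0 are illustrative assumptions, not values used by this example (requires a CUDA-capable GPU with Apex installed):

```python
import torch
from apex import amp

model = torch.nn.Linear(10, 10).cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

# O1/O2 use dynamic loss scaling by default; passing loss_scale overrides that.
# A static scale of 128.0 is used here purely as an illustration.
model, optimizer = amp.initialize(model, optimizer,
                                  opt_level="O2",
                                  loss_scale=128.0)

loss = model(torch.randn(4, 10).cuda()).sum()

# The chosen loss scale is applied inside amp.scale_loss.
with amp.scale_loss(loss, optimizer) as scaled_loss:
    scaled_loss.backward()
optimizer.step()
```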
Softlink training and validation dataset into current directory

```bash
$ ln -sf /data/imagenet/train-jpeg/ train
$ ln -sf /data/imagenet/val-jpeg/ val
```
### `--opt-level O0` (FP32 training) and `O3` (FP16 training)