Unverified Commit b1ae9a23 authored by D. Khuê Lê-Huu, committed by GitHub

Clarification for training resnext101_32x8d on ImageNet (#4390)



* Fix training resuming in references/segmentation

* Clarification for training resnext101_32x8d

* Update references/classification/README.md
Co-authored-by: Nicolas Hug <contact@nicolas-hug.com>
parent 12fd3a62
@@ -40,12 +40,17 @@ python -m torch.distributed.launch --nproc_per_node=8 --use_env train.py\
### ResNext-101 32x8d
```
python -m torch.distributed.launch --nproc_per_node=8 --use_env train.py\
    --model resnext101_32x8d --epochs 100
```
Note that the above command corresponds to a single node with 8 GPUs. If you use
a different number of GPUs and/or a different batch size, then the learning rate
should be scaled accordingly. For example, the pretrained model provided by
`torchvision` was trained on 8 nodes, each with 8 GPUs (for a total of 64 GPUs),
with `--batch_size 16` and `--lr 0.4`, instead of the current defaults,
which are `--batch_size 32` and `--lr 0.1` respectively.
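The scaling described above follows the linear rule: the learning rate grows in proportion to the total (global) batch size. A minimal sketch, assuming the two configurations quoted in the note (the `scale_lr` helper is illustrative and not part of `train.py`):

```python
# Illustrative helper (not part of train.py): linear learning-rate scaling.
# Reference points from the README:
#   defaults:   8 GPUs * batch_size 32 = 256 images/step -> lr 0.1
#   pretrained: 64 GPUs * batch_size 16 = 1024 images/step -> lr 0.4

def scale_lr(num_gpus, batch_size_per_gpu, base_lr=0.1, base_total_batch=256):
    """Scale the base learning rate linearly with the total batch size."""
    total_batch = num_gpus * batch_size_per_gpu
    return base_lr * total_batch / base_total_batch

print(scale_lr(8, 32))   # default setup
print(scale_lr(64, 16))  # 64-GPU setup used for the pretrained weights
```

With 64 GPUs at batch size 16 per GPU, the helper recovers the `--lr 0.4` used for the released weights.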
### MobileNetV2
```
...