OpenDAS / apex
Commit 2b8277e5, authored Nov 10, 2018 by Michael Carilli
Updating example instructions to use batch size 224 for safety
parent 8bd382fa
Showing 1 changed file with 6 additions and 6 deletions
examples/imagenet/README.md
...
...
@@ -45,7 +45,7 @@ Optionally one can run imagenet with sync batch normalization by adding
## Example commands
-(note: batch size `--b 256` assumes your GPUs have >=16GB of onboard memory)
+(note: batch size `--b 224` assumes your GPUs have >=16GB of onboard memory)
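One way to check the memory assumption in the note above before launching is sketched here. The helper names `gpu_mem_ok` and `check_visible_gpus` and the 16000 MiB cutoff are assumptions made for illustration and are not part of the example scripts; the cutoff sits a little under 16384 MiB because a card sold as 16GB may report slightly less than that.

```bash
# Hypothetical helper (not part of the apex examples): succeed if a GPU's
# total memory, given in MiB, meets an assumed cutoff for a "16GB" card.
MIN_GPU_MEM_MIB=16000   # assumption: approximate floor, adjust as needed

gpu_mem_ok() {
    [ "$1" -ge "$MIN_GPU_MEM_MIB" ]
}

# Hypothetical wrapper: query every visible GPU and warn about small ones.
# Defined only, not called here; it requires nvidia-smi on the PATH.
check_visible_gpus() {
    nvidia-smi --query-gpu=memory.total --format=csv,noheader,nounits |
    while read -r mem; do
        gpu_mem_ok "$mem" || echo "GPU with ${mem} MiB may need a smaller --b"
    done
}
```

Calling `check_visible_gpus` on the training node prints one warning per GPU that falls under the assumed cutoff and prints nothing otherwise.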
```bash
### Softlink training dataset into current directory
...
...
@@ -53,16 +53,16 @@ $ ln -sf /data/imagenet/train-jpeg/ train
### Softlink validation dataset into current directory
$ ln -sf /data/imagenet/val-jpeg/ val
### Single-process training
-$ python main.py -a resnet50 --fp16 --b 256 --workers 4 --static-loss-scale 128.0 ./
+$ python main.py -a resnet50 --fp16 --b 224 --workers 4 --static-loss-scale 128.0 ./
### Multi-process training (uses all visible GPUs on the node)
-$ python -m torch.distributed.launch --nproc_per_node=NUM_GPUS main.py -a resnet50 --fp16 --b 256 --workers 4 --static-loss-scale 128.0 ./
+$ python -m torch.distributed.launch --nproc_per_node=NUM_GPUS main.py -a resnet50 --fp16 --b 224 --workers 4 --static-loss-scale 128.0 ./
### Multi-process training on GPUs 0 and 1 only
$ export CUDA_VISIBLE_DEVICES=0,1
-$ python -m torch.distributed.launch --nproc_per_node=2 main.py -a resnet50 --fp16 --b 256 --workers 4 ./
+$ python -m torch.distributed.launch --nproc_per_node=2 main.py -a resnet50 --fp16 --b 224 --workers 4 ./
### Multi-process training with FP16_Optimizer, static loss scale 128.0 (still uses FP32 master params)
-$ python -m torch.distributed.launch --nproc_per_node=NUM_GPUS main_fp16_optimizer.py -a resnet50 --fp16 --b 256 --static-loss-scale 128.0 --workers 4 ./
+$ python -m torch.distributed.launch --nproc_per_node=NUM_GPUS main_fp16_optimizer.py -a resnet50 --fp16 --b 224 --static-loss-scale 128.0 --workers 4 ./
### Multi-process training with FP16_Optimizer, dynamic loss scaling
-$ python -m torch.distributed.launch --nproc_per_node=NUM_GPUS main_fp16_optimizer.py -a resnet50 --fp16 --b 256 --dynamic-loss-scale --workers 4 ./
+$ python -m torch.distributed.launch --nproc_per_node=NUM_GPUS main_fp16_optimizer.py -a resnet50 --fp16 --b 224 --dynamic-loss-scale --workers 4 ./
```
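The `NUM_GPUS` placeholder in the multi-process commands above has to be filled in by hand. A minimal sketch of deriving it from a `CUDA_VISIBLE_DEVICES`-style list follows; the helper name `count_visible_gpus` is hypothetical and not part of the example scripts, and the final command is only printed, not executed.

```bash
# Hypothetical helper (not part of the apex examples): count the device ids
# in a comma-separated list such as "0,1". Prints 0 for an empty string.
count_visible_gpus() {
    if [ -z "$1" ]; then
        echo 0
        return 0
    fi
    # Turn commas into newlines, count the lines, and strip any padding
    # that some wc implementations add around the number.
    printf '%s\n' "$1" | tr ',' '\n' | wc -l | tr -d ' '
}

# Dry run: build the launch command for GPUs 0 and 1 and print it.
VISIBLE="0,1"
NUM_GPUS=$(count_visible_gpus "$VISIBLE")
echo "python -m torch.distributed.launch --nproc_per_node=$NUM_GPUS main.py -a resnet50 --fp16 --b 224 --workers 4 ./"
```

An unset or empty `CUDA_VISIBLE_DEVICES` normally means every GPU on the node is visible, so a result of 0 from this helper is better read as "no restriction set" than as "no GPUs".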
## Usage for `main.py` and `main_fp16_optimizer.py`
...
...