<font size=4><b>Reproduced ResNet on the CIFAR-10 and CIFAR-100 datasets.</b></font>

contact: panyx0718 (xpan@google.com)

<b>Dataset:</b>

https://www.cs.toronto.edu/~kriz/cifar.html

<b>Related papers:</b>

Identity Mappings in Deep Residual Networks

https://arxiv.org/pdf/1603.05027v2.pdf

Deep Residual Learning for Image Recognition

https://arxiv.org/pdf/1512.03385v1.pdf

Wide Residual Networks

https://arxiv.org/pdf/1605.07146v1.pdf

<b>Settings:</b>

* Randomly split the 50k training set into 45k/5k train/eval sets.
* Pad to 36x36 and randomly crop back to 32x32. Horizontal flip. Per-image whitening.
* Momentum optimizer with momentum 0.9.
* Learning rate schedule (by training step): 0.1 (up to 40k), 0.01 (up to 60k), 0.001 (>60k).
* L2 weight decay: 0.002.
* Batch size: 128. (28-10 wide and 1001 layer bottleneck use 64)
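
The learning rate schedule in the settings above is piecewise constant in the training step. A minimal Python sketch of it follows; the function name `learning_rate` is hypothetical, and the handling of the exact boundary steps (40k and 60k) is an assumption, since the list only gives the three plateaus.

```python
def learning_rate(step):
    """Piecewise-constant schedule from the settings list.

    Assumption: the boundary steps 40k and 60k belong to the earlier
    plateau, which matches the ">60k" wording for the last value.
    """
    if step <= 40_000:
        return 0.1
    if step <= 60_000:
        return 0.01
    return 0.001


if __name__ == "__main__":
    for step in (0, 40_000, 50_000, 60_000, 80_000):
        print(step, learning_rate(step))
```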

<b>Results:</b>

<left>
![Precisions](g3doc/cifar_resnet.gif)
</left>
<left>
![Precisions Legends](g3doc/cifar_resnet_legends.gif)
</left>


CIFAR-10 Model|Best Precision|Steps
--------------|--------------|------
32 layer|92.5%|~80k
110 layer|93.6%|~80k
164 layer bottleneck|94.5%|~80k
1001 layer bottleneck|94.9%|~80k
28-10 wide|95%|~90k

CIFAR-100 Model|Best Precision|Steps
---------------|--------------|-----
32 layer|68.1%|~45k
110 layer|71.3%|~60k
164 layer bottleneck|75.7%|~50k
1001 layer bottleneck|78.2%|~70k
28-10 wide|78.3%|~70k

<b>Prerequisites:</b>

1. Install TensorFlow and Bazel.

2. Download CIFAR-10/CIFAR-100 dataset.

```shell
curl -o cifar-10-binary.tar.gz https://www.cs.toronto.edu/~kriz/cifar-10-binary.tar.gz
curl -o cifar-100-binary.tar.gz https://www.cs.toronto.edu/~kriz/cifar-100-binary.tar.gz
```
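
The binary archives store fixed-length records. Per the dataset page, each CIFAR-10 record is 1 label byte followed by 3072 pixel bytes (the 32x32 red plane, then green, then blue), and each CIFAR-100 record has 2 label bytes (coarse, then fine) before the pixels. The sketch below shows one way to inspect an extracted `.bin` file in plain Python. `read_records` is a hypothetical helper for illustration, not part of this repository's `cifar_input.py`.

```python
IMAGE_BYTES = 32 * 32 * 3  # 1024 red, then 1024 green, then 1024 blue values


def read_records(path, label_bytes=1):
    """Yield (label, pixels) pairs from a CIFAR binary file.

    label_bytes=1 for CIFAR-10, label_bytes=2 for CIFAR-100. The label
    returned is the last label byte (the fine label for CIFAR-100).
    """
    record_bytes = label_bytes + IMAGE_BYTES
    with open(path, "rb") as f:
        while True:
            record = f.read(record_bytes)
            if not record:
                break
            if len(record) != record_bytes:
                raise ValueError(f"{path}: truncated record")
            yield record[label_bytes - 1], record[label_bytes:]
```

For example, `sum(1 for _ in read_records("cifar10/data_batch_1.bin"))` counts the records in one training batch, assuming the file has been extracted to that path.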

<b>How to run:</b>

```shell
# cd to your workspace.
# It contains an empty WORKSPACE file, the resnet code and the cifar10 dataset.
# Note: You can split 5k examples from the train set to use as an eval set.
ls -R
  .:
  cifar10  resnet  WORKSPACE

  ./cifar10:
  data_batch_1.bin  data_batch_2.bin  data_batch_3.bin  data_batch_4.bin
  data_batch_5.bin  test_batch.bin

  ./resnet:
  BUILD  cifar_input.py  g3doc  README.md  resnet_main.py  resnet_model.py

# Build everything for GPU.
bazel build -c opt --config=cuda resnet/...

# Train the model.
bazel-bin/resnet/resnet_main --train_data_path='cifar10/data_batch*' \
                             --log_root=/tmp/resnet_model \
                             --train_dir=/tmp/resnet_model/train \
                             --dataset='cifar10' \
                             --num_gpus=1

# Note that the training script will not produce any output. In the meantime, you can check on its progress using tensorboard:
tensorboard --logdir=/tmp/resnet_model

# Evaluate the model.
# Avoid running on the same GPU as the training job at the same time,
# otherwise, you might run out of memory.
bazel-bin/resnet/resnet_main --eval_data_path=cifar10/test_batch.bin \
                             --log_root=/tmp/resnet_model \
                             --eval_dir=/tmp/resnet_model/test \
                             --mode=eval \
                             --dataset='cifar10' \
                             --num_gpus=0
```
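
The note above about holding out 5k training examples for evaluation can be sketched in Python as follows. This is an illustration under stated assumptions, not the author's procedure: `split_train_eval` and the output file names are hypothetical, and the record size assumes the CIFAR-10 binary layout (1 label byte plus 3072 pixel bytes per record).

```python
import random

RECORD_BYTES = 1 + 32 * 32 * 3  # CIFAR-10: 1 label byte + 3072 pixel bytes


def split_train_eval(train_paths, train_out, eval_out, num_eval=5000, seed=0):
    """Shuffle all records and write num_eval of them to a separate file.

    Returns (number of train records, number of eval records).
    """
    records = []
    for path in train_paths:
        with open(path, "rb") as f:
            data = f.read()
        if len(data) % RECORD_BYTES != 0:
            raise ValueError(f"{path}: size is not a multiple of {RECORD_BYTES}")
        records.extend(
            data[i:i + RECORD_BYTES] for i in range(0, len(data), RECORD_BYTES)
        )
    if num_eval > len(records):
        raise ValueError("num_eval exceeds the number of available records")
    # A fixed seed keeps the split reproducible across runs.
    random.Random(seed).shuffle(records)
    with open(eval_out, "wb") as f:
        f.write(b"".join(records[:num_eval]))
    with open(train_out, "wb") as f:
        f.write(b"".join(records[num_eval:]))
    return len(records) - num_eval, num_eval
```

The two output files could then be passed to `--train_data_path` and `--eval_data_path` respectively. Writing them outside the `cifar10` directory avoids having the `data_batch*` pattern pick them up by accident.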