Internal change

PiperOrigin-RevId: 335115606

Commit 5a533fd4 authored Oct 02, 2020 by A. Unique TensorFlower
6 changed files
--- a/official/vision/beta/MODEL_GARDEN.md
+++ b/official/vision/beta/MODEL_GARDEN.md
@@ -5,16 +5,29 @@ TF Vision model garden provides a large collection of baselines and checkpoints
 ## Image Classification
-### Common Settings and Notes
-* We provide ImageNet checkpoints for [ResNet](https://arxiv.org/abs/1512.03385) models.
-* Training details:
-  * All models are trained from scratch for 90 epochs with batch size 4096 and 1.6 initial stepwise decay learning rate.
-  * Unless noted, all models are trained with l2 weight regularization and ReLU activation.
 ### ImageNet Baselines
-| model        | resolution    | epochs  | FLOPs (B)    | params (M)  |  Top-1  |  Top-5  | download |
-| ------------ |:-------------:| ---------:|-----------:|--------:|--------:|---------:|---------:|
-| ResNet-50    | 224x224       |    90    | 4.1 | 25.6 | 76.1 | 92.9 | config |
+#### Models trained with vanilla settings:
+* Models are trained from scratch with batch size 4096 and 1.6 initial learning rate.
+* Linear warmup is applied for the first 5 epochs.
+* Models trained with l2 weight regularization and ReLU activation.
+| model        | resolution    | epochs  |  Top-1  |  Top-5  | download |
+| ------------ |:-------------:|--------:|--------:|---------:|---------:|
+| ResNet-50    | 224x224       |    90    | 76.1 | 92.9 | config |
+| ResNet-50    | 224x224       |    200   | 77.1 | 93.5 | config |
+| ResNet-101   | 224x224       |    200   | 78.3 | 94.2 | config |
+| ResNet-152   | 224x224       |    200   | 78.7 | 94.3 | config |
+#### Models trained with training features including:
+* Label smoothing 0.1.
+* Swish activation.
+| model        | resolution    | epochs  |   Top-1  |  Top-5  | download |
+| ------------ |:-------------:| ---------:|--------:|---------:|---------:|
+| ResNet-50    | 224x224       |    200    | 78.1 | 93.9 | [config](https://github.com/tensorflow/models/blob/master/official/vision/beta/configs/experiments/image_classification/imagenet_resnet50_tpu.yaml) |
+| ResNet-101   | 224x224       |    200    | 79.1 | 94.5 | [config](https://github.com/tensorflow/models/blob/master/official/vision/beta/configs/experiments/image_classification/imagenet_resnet101_tpu.yaml) |
+| ResNet-152   | 224x224       |    200    | 79.4 | 94.7 | [config](https://github.com/tensorflow/models/blob/master/official/vision/beta/configs/experiments/image_classification/imagenet_resnet152_tpu.yaml) |
+| ResNet-200   | 224x224       |    200    | 79.9 | 94.8 | [config](https://github.com/tensorflow/models/blob/master/official/vision/beta/configs/experiments/image_classification/imagenet_resnet200_tpu.yaml) |
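
The two training features listed above (label smoothing 0.1 and Swish activation) can be illustrated with a minimal sketch. It assumes the common conventions: smoothed targets are `one_hot * (1 - s) + s / num_classes`, and Swish is `x * sigmoid(x)`. The helper names `smooth_labels` and `swish` are hypothetical and are not taken from the Model Garden code, whose implementation may differ in detail.

```python
import math


def smooth_labels(label: int, num_classes: int, smoothing: float = 0.1) -> list:
    """Return a smoothed one-hot target (assumed convention, see lead-in)."""
    off_value = smoothing / num_classes
    on_value = (1.0 - smoothing) + off_value
    return [on_value if i == label else off_value for i in range(num_classes)]


def swish(x: float) -> float:
    """Swish activation, x * sigmoid(x), written to avoid overflow in exp()."""
    if x >= 0:
        return x / (1.0 + math.exp(-x))
    e = math.exp(x)  # x < 0, so exp(x) cannot overflow
    return x * e / (1.0 + e)


if __name__ == "__main__":
    target = smooth_labels(label=3, num_classes=1001, smoothing=0.1)
    print(round(sum(target), 6))  # the smoothed target still sums to 1
    print(swish(0.0), swish(1.0))
```

With `num_classes: 1001` and `label_smoothing: 0.1`, as in the configs added by this commit, the true class keeps a target slightly above 0.9 and the remaining mass is spread evenly over all classes.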

--- a/official/vision/beta/configs/experiments/image_classification/imagenet_resnet101_tpu.yaml
+++ b/official/vision/beta/configs/experiments/image_classification/imagenet_resnet101_tpu.yaml
+# ResNet-101 ImageNet classification. 79.1% top-1 and 94.5% top-5 accuracy.
+runtime:
+  distribution_strategy: 'tpu'
+  mixed_precision_dtype: 'bfloat16'
+task:
+  model:
+    num_classes: 1001
+    input_size: [224, 224, 3]
+    backbone:
+      type: 'resnet'
+      resnet:
+        model_id: 101
+    norm_activation:
+      activation: 'swish'
+  losses:
+    l2_weight_decay: 0.0001
+    one_hot: true
+    label_smoothing: 0.1
+  train_data:
+    input_path: 'imagenet-2012-tfrecord/train*'
+    is_training: true
+    global_batch_size: 4096
+    dtype: 'bfloat16'
+  validation_data:
+    input_path: 'imagenet-2012-tfrecord/valid*'
+    is_training: false
+    global_batch_size: 4096
+    dtype: 'bfloat16'
+    drop_remainder: false
+trainer:
+  train_steps: 62400
+  validation_steps: 13
+  validation_interval: 312
+  steps_per_loop: 312
+  summary_interval: 312
+  checkpoint_interval: 312
+  optimizer_config:
+    optimizer:
+      type: 'sgd'
+      sgd:
+        momentum: 0.9
+    learning_rate:
+      type: 'cosine'
+      cosine:
+        initial_learning_rate: 1.6
+        decay_steps: 62400
+    warmup:
+      type: 'linear'
+      linear:
+        warmup_steps: 1560
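
The step counts in the `trainer` block above follow from the epoch counts and the global batch size. The sketch below reproduces that arithmetic. It assumes the standard ImageNet-2012 split sizes (1,281,167 training and 50,000 validation images), which the config does not state explicitly; the function name `schedule_steps` is hypothetical.

```python
import math

# Assumed ImageNet-2012 split sizes; not stated in the YAML itself.
NUM_TRAIN_IMAGES = 1_281_167
NUM_VAL_IMAGES = 50_000
GLOBAL_BATCH_SIZE = 4096  # global_batch_size in the config


def schedule_steps(epochs: int, warmup_epochs: int = 5) -> dict:
    """Derive trainer step counts from epoch counts (illustrative only)."""
    # Partial final batch is dropped when counting training steps per epoch.
    steps_per_epoch = NUM_TRAIN_IMAGES // GLOBAL_BATCH_SIZE
    return {
        "steps_per_epoch": steps_per_epoch,
        "train_steps": epochs * steps_per_epoch,
        "warmup_steps": warmup_epochs * steps_per_epoch,
        # Validation keeps the remainder (drop_remainder: false), so round up.
        "validation_steps": math.ceil(NUM_VAL_IMAGES / GLOBAL_BATCH_SIZE),
    }


if __name__ == "__main__":
    print(schedule_steps(200))
    # {'steps_per_epoch': 312, 'train_steps': 62400,
    #  'warmup_steps': 1560, 'validation_steps': 13}
    print(schedule_steps(90)["train_steps"])  # 28080
```

Under these assumptions, 200 epochs give `train_steps: 62400`, 5 warmup epochs give `warmup_steps: 1560`, and the 312-step intervals correspond to one epoch. The same arithmetic yields 28080 steps for the earlier 90-epoch ResNet-50 schedule.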
--- a/official/vision/beta/configs/experiments/image_classification/imagenet_resnet152_tpu.yaml
+++ b/official/vision/beta/configs/experiments/image_classification/imagenet_resnet152_tpu.yaml
+# ResNet-152 ImageNet classification. 79.4% top-1 and 94.7% top-5 accuracy.
+runtime:
+  distribution_strategy: 'tpu'
+  mixed_precision_dtype: 'bfloat16'
+task:
+  model:
+    num_classes: 1001
+    input_size: [224, 224, 3]
+    backbone:
+      type: 'resnet'
+      resnet:
+        model_id: 152
+    norm_activation:
+      activation: 'swish'
+  losses:
+    l2_weight_decay: 0.0001
+    one_hot: true
+    label_smoothing: 0.1
+  train_data:
+    input_path: 'imagenet-2012-tfrecord/train*'
+    is_training: true
+    global_batch_size: 4096
+    dtype: 'bfloat16'
+  validation_data:
+    input_path: 'imagenet-2012-tfrecord/valid*'
+    is_training: false
+    global_batch_size: 4096
+    dtype: 'bfloat16'
+    drop_remainder: false
+trainer:
+  train_steps: 62400
+  validation_steps: 13
+  validation_interval: 312
+  steps_per_loop: 312
+  summary_interval: 312
+  checkpoint_interval: 312
+  optimizer_config:
+    optimizer:
+      type: 'sgd'
+      sgd:
+        momentum: 0.9
+    learning_rate:
+      type: 'cosine'
+      cosine:
+        initial_learning_rate: 1.6
+        decay_steps: 62400
+    warmup:
+      type: 'linear'
+      linear:
+        warmup_steps: 1560
--- a/official/vision/beta/configs/experiments/image_classification/imagenet_resnet200_tpu.yaml
+++ b/official/vision/beta/configs/experiments/image_classification/imagenet_resnet200_tpu.yaml
+# ResNet-200 ImageNet classification. 79.9% top-1 and 94.8% top-5 accuracy.
+runtime:
+  distribution_strategy: 'tpu'
+  mixed_precision_dtype: 'bfloat16'
+task:
+  model:
+    num_classes: 1001
+    input_size: [224, 224, 3]
+    backbone:
+      type: 'resnet'
+      resnet:
+        model_id: 200
+    norm_activation:
+      activation: 'swish'
+  losses:
+    l2_weight_decay: 0.0001
+    one_hot: true
+    label_smoothing: 0.1
+  train_data:
+    input_path: 'imagenet-2012-tfrecord/train*'
+    is_training: true
+    global_batch_size: 4096
+    dtype: 'bfloat16'
+  validation_data:
+    input_path: 'imagenet-2012-tfrecord/valid*'
+    is_training: false
+    global_batch_size: 4096
+    dtype: 'bfloat16'
+    drop_remainder: false
+trainer:
+  train_steps: 62400
+  validation_steps: 13
+  validation_interval: 312
+  steps_per_loop: 312
+  summary_interval: 312
+  checkpoint_interval: 312
+  optimizer_config:
+    optimizer:
+      type: 'sgd'
+      sgd:
+        momentum: 0.9
+    learning_rate:
+      type: 'cosine'
+      cosine:
+        initial_learning_rate: 1.6
+        decay_steps: 62400
+    warmup:
+      type: 'linear'
+      linear:
+        warmup_steps: 1560
--- a/official/vision/beta/configs/experiments/image_classification/imagenet_resnet50_tpu.yaml
+++ b/official/vision/beta/configs/experiments/image_classification/imagenet_resnet50_tpu.yaml
+# ResNet-50 ImageNet classification. 78.1% top-1 and 93.9% top-5 accuracy.
 runtime:
  distribution_strategy: 'tpu'
  mixed_precision_dtype: 'bfloat16'
@@ -9,23 +10,25 @@ task:
      type: 'resnet'
      resnet:
        model_id: 50
+    norm_activation:
+      activation: 'swish'
  losses:
    l2_weight_decay: 0.0001
-    one_hot: True
+    one_hot: true
    label_smoothing: 0.1
  train_data:
    input_path: 'imagenet-2012-tfrecord/train*'
-    is_training: True
+    is_training: true
    global_batch_size: 4096
    dtype: 'bfloat16'
  validation_data:
    input_path: 'imagenet-2012-tfrecord/valid*'
-    is_training: False
+    is_training: false
    global_batch_size: 4096
    dtype: 'bfloat16'
-    drop_remainder: False
+    drop_remainder: false
 trainer:
-  train_steps: 28080
+  train_steps: 62400
  validation_steps: 13
  validation_interval: 312
  steps_per_loop: 312
@@ -37,10 +40,10 @@ trainer:
      sgd:
        momentum: 0.9
    learning_rate:
-      type: 'stepwise'
-      stepwise:
-        boundaries: [9360, 18720, 24960]
-        values: [1.6, 0.16, 0.016, 0.0016]
+      type: 'cosine'
+      cosine:
+        initial_learning_rate: 1.6
+        decay_steps: 62400
    warmup:
      type: 'linear'
      linear:
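
The hunk above replaces the stepwise learning-rate schedule with cosine decay while keeping linear warmup. The sketch below compares the two, using the values from the diff. It assumes the usual cosine form `0.5 * lr0 * (1 + cos(pi * step / decay_steps))` with a final rate of zero, a warmup that ramps linearly from zero to the schedule's value at `warmup_steps`, and inclusive boundaries for the stepwise schedule. These are assumptions about the behaviour, not a copy of the Model Garden implementation, and both function names are hypothetical.

```python
import math


def cosine_with_linear_warmup(step: int,
                              initial_lr: float = 1.6,
                              decay_steps: int = 62400,
                              warmup_steps: int = 1560) -> float:
    """Cosine decay preceded by linear warmup (assumed form, see lead-in)."""
    def cosine(s: int) -> float:
        s = min(s, decay_steps)  # hold the final value after decay_steps
        return 0.5 * initial_lr * (1.0 + math.cos(math.pi * s / decay_steps))

    if step < warmup_steps:
        # Ramp linearly from 0 to the cosine value reached at warmup_steps.
        return cosine(warmup_steps) * step / warmup_steps
    return cosine(step)


def stepwise(step: int,
             boundaries=(9360, 18720, 24960),
             values=(1.6, 0.16, 0.016, 0.0016)) -> float:
    """Piecewise-constant schedule from the removed lines (warmup omitted)."""
    for boundary, value in zip(boundaries, values):
        if step <= boundary:
            return value
    return values[-1]


if __name__ == "__main__":
    for step in (0, 1560, 31200, 62400):
        print(step, round(cosine_with_linear_warmup(step), 4))
    for step in (0, 9361, 28080):
        print(step, stepwise(step))
```

The stepwise boundaries 9360, 18720 and 24960 fall at multiples of 312 steps (30, 60 and 80 times the `steps_per_loop` value), whereas the cosine schedule decays smoothly over all 62400 steps.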

--- a/official/vision/beta/configs/image_classification.py
+++ b/official/vision/beta/configs/image_classification.py
@@ -93,6 +93,8 @@ def image_classification_imagenet() -> cfg.ExperimentConfig:
          model=ImageClassificationModel(
              num_classes=1001,
              input_size=[224, 224, 3],
+              backbone=backbones.Backbone(
+                  type='resnet', resnet=backbones.ResNet(model_id=50)),
              norm_activation=common.NormActivation(
                  norm_momentum=0.9, norm_epsilon=1e-5)),
          losses=Losses(l2_weight_decay=1e-4),