Unverified Commit 965cc3ee authored by Ayushman Kumar's avatar Ayushman Kumar Committed by GitHub
Browse files

Merge pull request #7 from tensorflow/master

updated
parents 1f3247f4 1f685c54
* @tensorflow/tf-garden-team
* @tensorflow/tf-garden-team @tensorflow/tf-model-garden-team
/official/ @rachellj218 @saberkun
/official/bert @saberkun @hongjunChoi @rachellj218
/research/adv_imagenet_models/ @alexeykurakin
/research/adversarial_crypto/ @dave-andersen
/research/adversarial_logit_pairing/ @AlexeyKurakin
/research/adversarial_logit_pairing/ @alexeykurakin
/research/adversarial_text/ @rsepassi @a-dai
/research/adv_imagenet_models/ @AlexeyKurakin
/research/attention_ocr/ @alexgorban
/research/audioset/ @plakal @dpwe
/research/autoaugment/* @barretzoph
......@@ -14,10 +14,13 @@
/research/compression/ @nmjohn
/research/cvt_text/ @clarkkev @lmthang
/research/deep_contextual_bandits/ @rikel
/research/deep_speech/ @yhliang2018
/research/deeplab/ @aquariusjay @yknzhu @gpapan
/research/delf/ @andrefaraujo
/research/domain_adaptation/ @bousmalis @dmrd
/research/efficient-hrl/ @ofirnachum
/research/feelvos/ @pvoigtlaender @yuningchai @aquariusjay
/research/fivo/ @dieterichlawson
/research/global_objectives/ @mackeya-google
/research/im2txt/ @cshallue
/research/inception/ @shlens @vincentvanhoucke
......@@ -26,7 +29,7 @@
/research/learning_to_remember_rare_events/ @lukaszkaiser @ofirnachum
/research/learning_unsupervised_learning/ @lukemetz @nirum
/research/lexnet_nc/ @vered1986 @waterson
/research/lfads/ @jazcollins @susillo
/research/lfads/ @jazcollins @sussillo
/research/lm_1b/ @oriolvinyals @panyx0718
/research/lm_commonsense/ @thtrieu
/research/lstm_object_detection/ @dreamdragon @masonliuw @yinxiaoli @yongzhe2160
......@@ -39,9 +42,10 @@
/research/object_detection/ @jch1 @tombstone @derekjchow @jesu9 @dreamdragon @pkulzc
/research/pcl_rl/ @ofirnachum
/research/ptn/ @xcyan @arkanath @hellojas @honglaklee
/research/qa_kg/ @yuyuz
/research/real_nvp/ @laurent-dinh
/research/rebar/ @gjtucker
/research/resnet/ @panyx0718
/research/sentiment_analysis/ @sculd
/research/seq2species/ @apbusia @depristo
/research/skip_thoughts/ @cshallue
/research/slim/ @sguada @nathansilberman
......@@ -50,15 +54,7 @@
/research/struct2depth/ @aneliaangelova
/research/swivel/ @waterson
/research/tcn/ @coreylynch @sermanet
/research/tensorrt/ @karmel
/research/textsum/ @panyx0718 @peterjliu
/research/transformer/ @daviddao
/research/vid2depth/ @rezama
/research/video_prediction/ @cbfinn
/research/fivo/ @dieterichlawson
/samples/ @MarkDaoust @lamberta
/samples/languages/java/ @asimshankar
/tutorials/embedding/ @zffchen78 @a-dai
/tutorials/image/ @sherrym @shlens
/tutorials/image/cifar10_estimator/ @protoget
/tutorials/rnn/ @lukaszkaiser @ebrevdo
# TensorFlow Models
![Logo](https://storage.googleapis.com/model_garden_artifacts/TF_Model_Garden.png)
This repository contains a number of different models implemented in [TensorFlow](https://www.tensorflow.org):
# Welcome to the Model Garden for TensorFlow
The [official models](official) are a collection of example models that use TensorFlow 2's high-level APIs. They are intended to be well-maintained, tested, and kept up to date with the latest stable TensorFlow API. They should also be reasonably optimized for fast performance while still being easy to read. We especially recommend newer TensorFlow users to start here.
The TensorFlow Model Garden is a repository with a number of different implementations of state-of-the-art (SOTA) models and modeling solutions for TensorFlow users. We aim to demonstrate the best practices for modeling so that TensorFlow users can take full advantage of TensorFlow for their research and product development.
The [research models](https://github.com/tensorflow/models/tree/master/research) are a large collection of models implemented in TensorFlow by researchers. They are not officially supported or available in release branches; it is up to the individual researchers to maintain the models and/or provide support on issues and pull requests.
## Structure
| Folder | Description |
|-----------|-------------|
| [official](official) | • **A collection of example implementations for SOTA models using the latest TensorFlow 2's high-level APIs**<br />• Officially maintained, supported, and kept up to date with the latest TensorFlow 2 APIs<br />• Reasonably optimized for fast performance while still being easy to read |
| [research](research) | • A collection of research model implementations in TensorFlow 1 or 2 by researchers<br />• Up to the individual researchers to maintain the model implementations and/or provide support on issues and pull requests |
## Contribution guidelines
If you want to contribute to models, be sure to review the [contribution guidelines](CONTRIBUTING.md).
If you want to contribute to models, please review the [contribution guidelines](CONTRIBUTING.md).
## License
......
# Offically Supported TensorFlow 2.1 Models on Cloud TPU
# Offically Supported TensorFlow 2.1+ Models on Cloud TPU
## Natural Language Processing
* [bert](nlp/bert): A powerful pre-trained language representation model:
BERT, which stands for Bidirectional Encoder Representations from
Transformers.
[BERT FineTuning with Cloud TPU](https://cloud.google.com/tpu/docs/tutorials/bert-2.x) provides step by step instructions on Cloud TPU training. You can look [Bert MNLI Tensorboard.dev metrics](https://tensorboard.dev/experiment/mIah5lppTASvrHqWrdr6NA) for MNLI fine tuning task.
[BERT FineTuning with Cloud TPU](https://cloud.google.com/tpu/docs/tutorials/bert-2.x) provides step by step instructions on Cloud TPU training. You can look [Bert MNLI Tensorboard.dev metrics](https://tensorboard.dev/experiment/LijZ1IrERxKALQfr76gndA) for MNLI fine tuning task.
* [transformer](nlp/transformer): A transformer model to translate the WMT
English to German dataset.
[Training transformer on Cloud TPU](https://cloud.google.com/tpu/docs/tutorials/transformer-2.x) for step by step instructions on Cloud TPU training.
## Computer Vision
* [efficientnet](vision/image_classification): A family of convolutional
neural networks that scale by balancing network depth, width, and
resolution and can be used to classify ImageNet's dataset of 1000 classes.
See [Tensorboard.dev training metrics](https://tensorboard.dev/experiment/KnaWjrq5TXGfv0NW5m7rpg/#scalars).
* [mnist](vision/image_classification): A basic model to classify digits
from the MNIST dataset. See [Running MNIST on Cloud TPU](https://cloud.google.com/tpu/docs/tutorials/mnist-2.x) tutorial and [Tensorboard.dev metrics](https://tensorboard.dev/experiment/mIah5lppTASvrHqWrdr6NA).
* [mask-rcnn](vision/detection): An object detection and instance segmentation model. See [Tensorboard.dev training metrics](https://tensorboard.dev/experiment/LH7k0fMsRwqUAcE09o9kPA).
* [resnet](vision/image_classification): A deep residual network that can
be used to classify ImageNet's dataset of 1000 classes.
See [Training ResNet on Cloud TPU](https://cloud.google.com/tpu/docs/tutorials/resnet-2.x) tutorial and [Tensorboard.dev metrics](https://tensorboard.dev/experiment/CxlDK8YMRrSpYEGtBRpOhg).
......
# TensorFlow Official Models
![Logo](https://storage.googleapis.com/model_garden_artifacts/TF_Model_Garden.png)
The TensorFlow official models are a collection of models that use
TensorFlow's high-level APIs. They are intended to be well-maintained, tested,
and kept up to date with the latest TensorFlow API. They should also be
reasonably optimized for fast performance while still being easy to read.
# TensorFlow Official Models
These models are used as end-to-end tests, ensuring that the models run with the
same or improved speed and performance with each new TensorFlow build.
The TensorFlow official models are a collection of models
that use TensorFlow’s high-level APIs.
They are intended to be well-maintained, tested, and kept up to date
with the latest TensorFlow API.
They should also be reasonably optimized for fast performance while still
being easy to read.
These models are used as end-to-end tests, ensuring that the models run
with the same or improved speed and performance with each new TensorFlow build.
## Tensorflow releases
## Model Implementations
The master branch of the models are **in development** with TensorFlow 2.x, and
they target the
[nightly binaries](https://github.com/tensorflow/tensorflow#installation) built
from the
[master branch of TensorFlow](https://github.com/tensorflow/tensorflow/tree/master).
You may start from installing with pip:
### Natural Language Processing
```shell
pip3 install tf-nightly
```
| Model | Description | Reference |
| ----- | ----------- | --------- |
| [ALBERT](nlp/albert) | A Lite BERT for Self-supervised Learning of Language Representations | [arXiv:1909.11942](https://arxiv.org/abs/1909.11942) |
| [BERT](nlp/bert) | A powerful pre-trained language representation model: BERT (Bidirectional Encoder Representations from Transformers) | [arXiv:1810.04805](https://arxiv.org/abs/1810.04805) |
| [NHNet](nlp/nhnet) | A transformer-based multi-sequence to sequence model: Generating Representative Headlines for News Stories | [arXiv:2001.09386](https://arxiv.org/abs/2001.09386) |
| [Transformer](nlp/transformer) | A transformer model to translate the WMT English to German dataset | [arXiv:1706.03762](https://arxiv.org/abs/1706.03762) |
| [XLNet](nlp/xlnet) | XLNet: Generalized Autoregressive Pretraining for Language Understanding | [arXiv:1906.08237](https://arxiv.org/abs/1906.08237) |
**Stable versions** of the official models targeting releases of TensorFlow are
available as tagged branches or
[downloadable releases](https://github.com/tensorflow/models/releases). Model
repository version numbers match the target TensorFlow release, such that
[release v2.1.0](https://github.com/tensorflow/models/releases/tag/v2.1.0) are
compatible with
[TensorFlow v2.1.0](https://github.com/tensorflow/tensorflow/releases/tag/v2.1.0).
### Computer Vision
If you are on a version of TensorFlow earlier than 1.4, please
[update your installation](https://www.tensorflow.org/install/).
| Model | Description | Reference |
| ----- | ----------- | --------- |
| [MNIST](vision/image_classification) | A basic model to classify digits from the MNIST dataset | [Link](http://yann.lecun.com/exdb/mnist/) |
| [ResNet](vision/image_classification) | A deep residual network for image recognition | [arXiv:1512.03385](https://arxiv.org/abs/1512.03385) |
| [RetinaNet](vision/detection) | A fast and powerful object detector | [arXiv:1708.02002](https://arxiv.org/abs/1708.02002) |
| [Mask R-CNN](vision/detection) | An object detection and instance segmentation model | [arXiv:1703.06870](https://arxiv.org/abs/1703.06870) |
## Requirements
### Other models
Please follow the below steps before running models in this repo:
| Model | Description | Reference |
| ----- | ----------- | --------- |
| [NCF](recommendation) | Neural Collaborative Filtering model for recommendation tasks | [arXiv:1708.05031](https://arxiv.org/abs/1708.05031) |
1. TensorFlow
[nightly binaries](https://github.com/tensorflow/tensorflow#installation)
---
2. If users would like to clone this repo but do not care about change history,
please consider:
## How to get started with the Model Garden official models
```shell
export repo_version="master"
git clone -b ${repo_version} https://github.com/tensorflow/models.git --depth=1
```
* The models in the master branch are developed using TensorFlow 2,
and they target the TensorFlow [nightly binaries](https://github.com/tensorflow/tensorflow#installation)
built from the
[master branch of TensorFlow](https://github.com/tensorflow/tensorflow/tree/master).
* The stable versions targeting releases of TensorFlow are available
as tagged branches or [downloadable releases](https://github.com/tensorflow/models/releases).
* Model repository version numbers match the target TensorFlow release,
such that
[release v2.1.0](https://github.com/tensorflow/models/releases/tag/v2.1.0)
are compatible with
[TensorFlow v2.1.0](https://github.com/tensorflow/tensorflow/releases/tag/v2.1.0).
3. Add the top-level ***/models*** folder to the Python path with the command:
Please follow the below steps before running models in this repository.
```shell
export PYTHONPATH=$PYTHONPATH:/path/to/models
```
### Requirements
Using Colab:
* The latest TensorFlow Model Garden release and TensorFlow 2
* If you are on a version of TensorFlow earlier than 2.1, please
upgrade your TensorFlow to [the latest TensorFlow 2](https://www.tensorflow.org/install/).
```python
import os
os.environ['PYTHONPATH'] += ":/path/to/models"
```
```shell
pip3 install tf-nightly
```
4. Install dependencies:
### Installation
```shell
pip3 install --user -r official/requirements.txt
```
#### Method 1: Install the TensorFlow Model Garden pip package
**tf-models-nightly** is the nightly Model Garden package
created daily automatically. pip will install all models
and dependencies automatically.
To make Official Models easier to use, we are planning to create a pip
installable Official Models package. This is being tracked in
[#917](https://github.com/tensorflow/models/issues/917).
```shell
pip install tf-models-nightly
```
## Available models
Please check out our [example](colab/bert.ipynb)
to learn how to use a PIP package.
**NOTE: For Officially Supported TPU models please check [README-TPU](README-TPU.md).**
#### Method 2: Clone the source
**NOTE:** Please make sure to follow the steps in the
[Requirements](#requirements) section.
1. Clone the GitHub repository:
### Natural Language Processing
```shell
git clone https://github.com/tensorflow/models.git
```
* [albert](nlp/albert): A Lite BERT for Self-supervised Learning of Language
Representations.
* [bert](nlp/bert): A powerful pre-trained language representation model:
BERT, which stands for Bidirectional Encoder Representations from
Transformers.
* [transformer](nlp/transformer): A transformer model to translate the WMT English
to German dataset.
* [xlnet](nlp/xlnet): XLNet: Generalized Autoregressive Pretraining for
Language Understanding.
2. Add the top-level ***/models*** folder to the Python path.
### Computer Vision
```shell
export PYTHONPATH=$PYTHONPATH:/path/to/models
```
* [mnist](vision/image_classification): A basic model to classify digits from
the MNIST dataset.
* [resnet](vision/image_classification): A deep residual network that can be
used to classify both CIFAR-10 and ImageNet's dataset of 1000 classes.
* [retinanet](vision/detection): A fast and powerful object detector.
If you are using a Colab notebook, please set the Python path with os.environ.
### Others
```python
import os
os.environ['PYTHONPATH'] += ":/path/to/models"
```
* [ncf](recommendation): Neural Collaborative Filtering model for
recommendation tasks.
3. Install other dependencies
Models that will not update to TensorFlow 2.x stay inside R1 directory:
```shell
pip3 install --user -r official/requirements.txt
```
* [boosted_trees](r1/boosted_trees): A Gradient Boosted Trees model to
classify higgs boson process from HIGGS Data Set.
* [wide_deep](r1/wide_deep): A model that combines a wide model and deep
network to classify census income data.
---
## More models to come!
We are in the progress to revamp official model garden with TensorFlow 2.0 and
Keras. In the near future, we will bring:
The team is actively developing new models.
In the near future, we will add:
* State-of-the-art language understanding models: XLNet, GPT2, and more
members in Transformer family.
* Start-of-the-art image classification models: EfficientNet, MnasNet and
variants.
* A set of excellent objection detection models.
- State-of-the-art language understanding models:
More members in Transformer family
- Start-of-the-art image classification models:
EfficientNet, MnasNet and variants.
- A set of excellent objection detection models.
If you would like to make any fixes or improvements to the models, please
[submit a pull request](https://github.com/tensorflow/models/compare).
## New Models
---
## Contributions
The team is actively working to add new models to the repository. Every model
should follow the following guidelines, to uphold the our objectives of
readable, usable, and maintainable code.
Every model should follow our guidelines to uphold our objectives of readable,
usable, and maintainable code.
**General guidelines**
### General Guidelines
* Code should be well documented and tested.
* Runnable from a blank environment with relative ease.
* Trainable on: single GPU/CPU (baseline), multiple GPUs, TPU
* Compatible with Python 3 (using [six](https://pythonhosted.org/six/) when
being compatible with Python 2 is necessary)
* Conform to [Google Python Style Guide](https://github.com/google/styleguide/blob/gh-pages/pyguide.md)
- Code should be well documented and tested.
- Runnable from a blank environment with ease.
- Trainable on: single GPU/CPU (baseline), multiple GPUs & TPUs
- Compatible with Python 3 (using [six](https://pythonhosted.org/six/)
when being compatible with Python 2 is necessary)
- Conform to
[Google Python Style Guide](https://github.com/google/styleguide/blob/gh-pages/pyguide.md)
**Implementation guidelines**
### Implementation Guidelines
These guidelines exist so the model implementations are consistent for better
readability and maintainability.
These guidelines are to ensure consistent model implementations for
better readability and maintainability.
* Use [common utility functions](utils)
* Export SavedModel at the end of training.
* Consistent flags and flag-parsing library
([read more here](utils/flags/guidelines.md))
* Produce benchmarks and logs ([read more here](utils/logs/guidelines.md))
- Use [common utility functions](utils)
- Export SavedModel at the end of the training.
- Consistent flags and flag-parsing library ([read more here](utils/flags/guidelines.md))
# Lint as: python3
# Copyright 2019 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Utils to annotate and trace benchmarks."""
from __future__ import absolute_import
......
......@@ -34,7 +34,7 @@ from official.benchmark import bert_benchmark_utils as benchmark_utils
from official.nlp.bert import configs
from official.nlp.bert import run_classifier
from official.utils.misc import distribution_utils
from official.utils.testing import benchmark_wrappers
from official.benchmark import benchmark_wrappers
# pylint: disable=line-too-long
PRETRAINED_CHECKPOINT_PATH = 'gs://cloud-tpu-checkpoints/bert/keras_bert/uncased_L-24_H-1024_A-16/bert_model.ckpt'
......@@ -56,6 +56,7 @@ class BertClassifyBenchmarkBase(benchmark_utils.BertBenchmarkBase):
self.num_epochs = None
self.num_steps_per_epoch = None
self.tpu = tpu
FLAGS.steps_per_loop = 50
@flagsaver.flagsaver
def _run_bert_classifier(self, callbacks=None, use_ds=True):
......@@ -81,8 +82,6 @@ class BertClassifyBenchmarkBase(benchmark_utils.BertBenchmarkBase):
distribution_strategy='mirrored' if use_ds else 'off',
num_gpus=self.num_gpus)
steps_per_loop = 50
max_seq_length = input_meta_data['max_seq_length']
train_input_fn = run_classifier.get_dataset_fn(
FLAGS.train_data_path,
......@@ -101,7 +100,7 @@ class BertClassifyBenchmarkBase(benchmark_utils.BertBenchmarkBase):
FLAGS.model_dir,
epochs,
steps_per_epoch,
steps_per_loop,
FLAGS.steps_per_loop,
eval_steps,
warmup_steps,
FLAGS.learning_rate,
......
......@@ -23,11 +23,11 @@ import time
# pylint: disable=g-bad-import-order
import numpy as np
from absl import flags
import tensorflow.compat.v2 as tf
import tensorflow as tf
# pylint: enable=g-bad-import-order
from official.utils.flags import core as flags_core
from official.utils.testing.perfzero_benchmark import PerfZeroBenchmark
from official.benchmark.perfzero_benchmark import PerfZeroBenchmark
FLAGS = flags.FLAGS
......
......@@ -33,7 +33,7 @@ from official.benchmark import bert_benchmark_utils as benchmark_utils
from official.nlp.bert import run_squad
from official.utils.misc import distribution_utils
from official.utils.misc import keras_utils
from official.utils.testing import benchmark_wrappers
from official.benchmark import benchmark_wrappers
# pylint: disable=line-too-long
......@@ -104,7 +104,6 @@ class BertSquadBenchmarkBase(benchmark_utils.BertBenchmarkBase):
@flagsaver.flagsaver
def _train_squad(self, run_eagerly=False, ds_type='mirrored'):
"""Runs BERT SQuAD training. Uses mirrored strategy by default."""
assert tf.version.VERSION.startswith('2.')
self._init_gpu_and_data_threads()
input_meta_data = self._read_input_meta_data_from_file()
strategy = self._get_distribution_strategy(ds_type)
......@@ -118,7 +117,6 @@ class BertSquadBenchmarkBase(benchmark_utils.BertBenchmarkBase):
@flagsaver.flagsaver
def _evaluate_squad(self, ds_type='mirrored'):
"""Runs BERT SQuAD evaluation. Uses mirrored strategy by default."""
assert tf.version.VERSION.startswith('2.')
self._init_gpu_and_data_threads()
input_meta_data = self._read_input_meta_data_from_file()
strategy = self._get_distribution_strategy(ds_type)
......@@ -128,7 +126,7 @@ class BertSquadBenchmarkBase(benchmark_utils.BertBenchmarkBase):
eval_metrics = run_squad.eval_squad(strategy=strategy,
input_meta_data=input_meta_data)
# Use F1 score as reported evaluation metric.
self.eval_metrics = eval_metrics['f1']
self.eval_metrics = eval_metrics['final_f1']
class BertSquadBenchmarkReal(BertSquadBenchmarkBase):
......@@ -254,7 +252,7 @@ class BertSquadBenchmarkReal(BertSquadBenchmarkBase):
self._setup()
self.num_gpus = 8
FLAGS.model_dir = self._get_model_dir('benchmark_8_gpu_squad')
FLAGS.train_batch_size = 32
FLAGS.train_batch_size = 24
FLAGS.tf_gpu_thread_mode = 'gpu_private'
self._run_and_report_benchmark()
......
......@@ -19,9 +19,8 @@ from __future__ import division
from __future__ import print_function
import tensorflow as tf
from official.benchmark.perfzero_benchmark import PerfZeroBenchmark
from official.utils.flags import core as flags_core
from official.utils.testing.perfzero_benchmark import PerfZeroBenchmark
class KerasBenchmark(PerfZeroBenchmark):
......@@ -32,7 +31,6 @@ class KerasBenchmark(PerfZeroBenchmark):
default_flags=None,
flag_methods=None,
tpu=None):
assert tf.version.VERSION.startswith('2.')
super(KerasBenchmark, self).__init__(
output_dir=output_dir,
default_flags=default_flags,
......
......@@ -23,7 +23,7 @@ from absl import flags
import tensorflow as tf # pylint: disable=g-bad-import-order
from official.benchmark import keras_benchmark
from official.utils.testing import benchmark_wrappers
from official.benchmark import benchmark_wrappers
from official.benchmark.models import resnet_cifar_main
MIN_TOP_1_ACCURACY = 0.929
......
# Lint as: python3
# Copyright 2018 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
......@@ -13,17 +14,22 @@
# limitations under the License.
# ==============================================================================
"""Executes Keras benchmarks and accuracy tests."""
# pylint: disable=line-too-long
from __future__ import print_function
import json
import os
import time
from typing import Any, MutableMapping, Optional
from absl import flags
import tensorflow as tf # pylint: disable=g-bad-import-order
from official.benchmark import benchmark_wrappers
from official.benchmark import keras_benchmark
from official.utils.testing import benchmark_wrappers
from official.vision.image_classification.resnet import resnet_imagenet_main
from official.benchmark.models import resnet_imagenet_main
from official.vision.image_classification import classifier_trainer
MIN_TOP_1_ACCURACY = 0.76
MAX_TOP_1_ACCURACY = 0.77
......@@ -41,10 +47,74 @@ MODEL_OPTIMIZATION_TOP_1_ACCURACY = {
FLAGS = flags.FLAGS
def _get_classifier_parameters(
num_gpus: int = 0,
builder: str = 'records',
skip_eval: bool = False,
distribution_strategy: str = 'mirrored',
per_replica_batch_size: int = 128,
epochs: int = 90,
steps: int = 0,
epochs_between_evals: int = 1,
dtype: str = 'float32',
enable_xla: bool = False,
run_eagerly: bool = False,
gpu_thread_mode: Optional[str] = None,
dataset_num_private_threads: Optional[int] = None,
loss_scale: Optional[str] = None) -> MutableMapping[str, Any]:
"""Gets classifier trainer's ResNet parameters."""
return {
'runtime': {
'num_gpus': num_gpus,
'distribution_strategy': distribution_strategy,
'run_eagerly': run_eagerly,
'enable_xla': enable_xla,
'dataset_num_private_threads': dataset_num_private_threads,
'gpu_thread_mode': gpu_thread_mode,
'loss_scale': loss_scale,
},
'train_dataset': {
'builder': builder,
'use_per_replica_batch_size': True,
'batch_size': per_replica_batch_size,
'image_size': 224,
'dtype': dtype,
},
'validation_dataset': {
'builder': builder,
'batch_size': per_replica_batch_size,
'use_per_replica_batch_size': True,
'image_size': 224,
'dtype': dtype,
},
'train': {
'epochs': epochs,
'steps': steps,
'callbacks': {
'enable_tensorboard': False,
'enable_checkpoint_and_export': False,
'enable_time_history': True,
},
},
'model': {
'loss': {
'label_smoothing': 0.1,
},
},
'evaluation': {
'epochs_between_evals': epochs_between_evals,
'skip_eval': skip_eval,
},
}
class Resnet50KerasAccuracy(keras_benchmark.KerasBenchmark):
"""Benchmark accuracy tests for ResNet50 in Keras."""
def __init__(self, output_dir=None, root_data_dir=None, **kwargs):
def __init__(self,
output_dir: Optional[str] = None,
root_data_dir: Optional[str] = None,
**kwargs):
"""A benchmark class.
Args:
......@@ -55,97 +125,54 @@ class Resnet50KerasAccuracy(keras_benchmark.KerasBenchmark):
named arguments before updating the constructor.
"""
flag_methods = [resnet_imagenet_main.define_imagenet_keras_flags]
flag_methods = [classifier_trainer.define_classifier_flags]
self.data_dir = os.path.join(root_data_dir, 'imagenet')
super(Resnet50KerasAccuracy, self).__init__(
output_dir=output_dir, flag_methods=flag_methods)
def benchmark_8_gpu(self):
"""Test Keras model with eager, dist_strat and 8 GPUs."""
self._setup()
FLAGS.num_gpus = 8
FLAGS.data_dir = self.data_dir
FLAGS.batch_size = 128 * 8
FLAGS.train_epochs = 90
FLAGS.epochs_between_evals = 10
FLAGS.model_dir = self._get_model_dir('benchmark_8_gpu')
FLAGS.dtype = 'fp32'
FLAGS.enable_eager = True
# Add some thread tunings to improve performance.
FLAGS.datasets_num_private_threads = 14
self._run_and_report_benchmark()
def benchmark_8_gpu_amp(self):
"""Test Keras model with eager, dist_strat and 8 GPUs with automatic mixed precision."""
self._setup()
FLAGS.num_gpus = 8
FLAGS.data_dir = self.data_dir
FLAGS.batch_size = 128 * 8
FLAGS.train_epochs = 90
FLAGS.epochs_between_evals = 10
FLAGS.model_dir = self._get_model_dir('benchmark_8_gpu_amp')
FLAGS.dtype = 'fp16'
FLAGS.enable_eager = True
FLAGS.fp16_implementation = 'graph_rewrite'
# Add some thread tunings to improve performance.
FLAGS.datasets_num_private_threads = 14
self._run_and_report_benchmark()
def benchmark_8_gpu_fp16(self):
"""Test Keras model with eager, dist_strat, 8 GPUs, and fp16."""
self._setup()
FLAGS.num_gpus = 8
FLAGS.data_dir = self.data_dir
FLAGS.batch_size = 256 * 8
FLAGS.train_epochs = 90
FLAGS.epochs_between_evals = 10
FLAGS.model_dir = self._get_model_dir('benchmark_8_gpu_fp16')
FLAGS.dtype = 'fp16'
FLAGS.enable_eager = True
# Thread tuning to improve performance.
FLAGS.tf_gpu_thread_mode = 'gpu_private'
self._run_and_report_benchmark()
def benchmark_xla_8_gpu_fp16(self):
"""Test Keras model with XLA, eager, dist_strat, 8 GPUs and fp16."""
self._setup()
FLAGS.num_gpus = 8
FLAGS.data_dir = self.data_dir
FLAGS.batch_size = 256 * 8
FLAGS.train_epochs = 90
FLAGS.epochs_between_evals = 10
FLAGS.model_dir = self._get_model_dir('benchmark_xla_8_gpu_fp16')
FLAGS.dtype = 'fp16'
FLAGS.enable_eager = True
FLAGS.enable_xla = True
# Thread tuning to improve performance.
FLAGS.tf_gpu_thread_mode = 'gpu_private'
self._run_and_report_benchmark()
def benchmark_xla_8_gpu_fp16_dynamic(self):
"""Test Keras model with XLA, eager, dist_strat, 8 GPUs, dynamic fp16."""
self._setup()
FLAGS.num_gpus = 8
@benchmark_wrappers.enable_runtime_flags
def _run_and_report_benchmark(
self,
experiment_name: str,
top_1_min: float = MIN_TOP_1_ACCURACY,
top_1_max: float = MAX_TOP_1_ACCURACY,
num_gpus: int = 0,
distribution_strategy: str = 'mirrored',
per_replica_batch_size: int = 128,
epochs: int = 90,
steps: int = 0,
epochs_between_evals: int = 1,
dtype: str = 'float32',
enable_xla: bool = False,
run_eagerly: bool = False,
gpu_thread_mode: Optional[str] = None,
dataset_num_private_threads: Optional[int] = None,
loss_scale: Optional[str] = None):
"""Runs and reports the benchmark given the provided configuration."""
FLAGS.model_type = 'resnet'
FLAGS.dataset = 'imagenet'
FLAGS.mode = 'train_and_eval'
FLAGS.data_dir = self.data_dir
FLAGS.batch_size = 256 * 8
FLAGS.train_epochs = 90
FLAGS.epochs_between_evals = 10
FLAGS.model_dir = self._get_model_dir('benchmark_xla_8_gpu_fp16_dynamic')
FLAGS.dtype = 'fp16'
FLAGS.enable_eager = True
FLAGS.enable_xla = True
FLAGS.loss_scale = 'dynamic'
# Thread tuning to improve performance.
FLAGS.tf_gpu_thread_mode = 'gpu_private'
self._run_and_report_benchmark(top_1_min=0.736)
FLAGS.model_dir = self._get_model_dir(experiment_name)
parameters = _get_classifier_parameters(
num_gpus=num_gpus,
distribution_strategy=distribution_strategy,
per_replica_batch_size=per_replica_batch_size,
epochs=epochs,
steps=steps,
epochs_between_evals=epochs_between_evals,
dtype=dtype,
enable_xla=enable_xla,
run_eagerly=run_eagerly,
gpu_thread_mode=gpu_thread_mode,
dataset_num_private_threads=dataset_num_private_threads,
loss_scale=loss_scale)
FLAGS.params_override = json.dumps(parameters)
total_batch_size = num_gpus * per_replica_batch_size
@benchmark_wrappers.enable_runtime_flags
def _run_and_report_benchmark(self,
top_1_min=MIN_TOP_1_ACCURACY,
top_1_max=MAX_TOP_1_ACCURACY):
start_time_sec = time.time()
stats = resnet_imagenet_main.run(flags.FLAGS)
stats = classifier_trainer.run(flags.FLAGS)
wall_time_sec = time.time() - start_time_sec
super(Resnet50KerasAccuracy, self)._report_benchmark(
......@@ -153,9 +180,56 @@ class Resnet50KerasAccuracy(keras_benchmark.KerasBenchmark):
wall_time_sec,
top_1_min=top_1_min,
top_1_max=top_1_max,
total_batch_size=FLAGS.batch_size,
total_batch_size=total_batch_size,
log_steps=100)
def benchmark_8_gpu(self):
"""Tests Keras model with eager, dist_strat and 8 GPUs."""
self._setup()
self._run_and_report_benchmark(
experiment_name='benchmark_8_gpu',
num_gpus=8,
per_replica_batch_size=128,
epochs=90,
epochs_between_evals=10,
dtype='float32')
def benchmark_8_gpu_fp16(self):
"""Tests Keras model with eager, dist_strat, 8 GPUs, and fp16."""
self._setup()
self._run_and_report_benchmark(
experiment_name='benchmark_8_gpu_fp16',
num_gpus=8,
per_replica_batch_size=256,
epochs=90,
epochs_between_evals=10,
dtype='float16')
def benchmark_xla_8_gpu_fp16(self):
"""Tests Keras model with XLA, eager, dist_strat, 8 GPUs and fp16."""
self._setup()
self._run_and_report_benchmark(
experiment_name='benchmark_xla_8_gpu_fp16',
num_gpus=8,
per_replica_batch_size=256,
epochs=90,
epochs_between_evals=10,
dtype='float16',
enable_xla=True)
def benchmark_xla_8_gpu_fp16_dynamic(self):
"""Tests Keras model with XLA, eager, dist_strat, 8 GPUs, dynamic fp16."""
self._setup()
self._run_and_report_benchmark(
experiment_name='benchmark_xla_8_gpu_fp16_dynamic',
top_1_min=0.736,
num_gpus=8,
per_replica_batch_size=256,
epochs=90,
epochs_between_evals=10,
dtype='float16',
loss_scale='dynamic')
def _get_model_dir(self, folder_name):
return os.path.join(self.output_dir, folder_name)
......@@ -197,8 +271,6 @@ class MobilenetV1KerasAccuracy(keras_benchmark.KerasBenchmark):
FLAGS.model_dir = self._get_model_dir('benchmark_8_gpu')
FLAGS.dtype = 'fp32'
FLAGS.enable_eager = True
# Add some thread tunings to improve performance.
FLAGS.datasets_num_private_threads = 14
self._run_and_report_benchmark()
@benchmark_wrappers.enable_runtime_flags
......@@ -221,6 +293,348 @@ class MobilenetV1KerasAccuracy(keras_benchmark.KerasBenchmark):
return os.path.join(self.output_dir, folder_name)
class Resnet50KerasClassifierBenchmarkBase(keras_benchmark.KerasBenchmark):
"""Resnet50 (classifier_trainer) benchmarks."""
def __init__(self, output_dir=None, default_flags=None,
tpu=None, dataset_builder='records', train_epochs=1,
train_steps=110, data_dir=None):
flag_methods = [classifier_trainer.define_classifier_flags]
self.dataset_builder = dataset_builder
self.train_epochs = train_epochs
self.train_steps = train_steps
self.data_dir = data_dir
super(Resnet50KerasClassifierBenchmarkBase, self).__init__(
output_dir=output_dir,
flag_methods=flag_methods,
default_flags=default_flags,
tpu=tpu)
@benchmark_wrappers.enable_runtime_flags
def _run_and_report_benchmark(
self,
experiment_name: str,
skip_steps: Optional[int] = None,
top_1_min: float = MIN_TOP_1_ACCURACY,
top_1_max: float = MAX_TOP_1_ACCURACY,
num_gpus: int = 0,
num_tpus: int = 0,
distribution_strategy: str = 'mirrored',
per_replica_batch_size: int = 128,
epochs_between_evals: int = 1,
dtype: str = 'float32',
enable_xla: bool = False,
run_eagerly: bool = False,
gpu_thread_mode: Optional[str] = None,
dataset_num_private_threads: Optional[int] = None,
loss_scale: Optional[str] = None):
"""Runs and reports the benchmark given the provided configuration."""
FLAGS.model_type = 'resnet'
FLAGS.dataset = 'imagenet'
FLAGS.mode = 'train_and_eval'
FLAGS.data_dir = self.data_dir
FLAGS.model_dir = self._get_model_dir(experiment_name)
parameters = _get_classifier_parameters(
builder=self.dataset_builder,
skip_eval=True,
num_gpus=num_gpus,
distribution_strategy=distribution_strategy,
per_replica_batch_size=per_replica_batch_size,
epochs=self.train_epochs,
steps=self.train_steps,
epochs_between_evals=epochs_between_evals,
dtype=dtype,
enable_xla=enable_xla,
gpu_thread_mode=gpu_thread_mode,
dataset_num_private_threads=dataset_num_private_threads,
loss_scale=loss_scale)
FLAGS.params_override = json.dumps(parameters)
if distribution_strategy == 'tpu':
total_batch_size = num_tpus * per_replica_batch_size
else:
total_batch_size = num_gpus * per_replica_batch_size
start_time_sec = time.time()
stats = classifier_trainer.run(flags.FLAGS)
wall_time_sec = time.time() - start_time_sec
# Number of logged step time entries that are excluded in performance
# report. We keep results from last 100 batches, or skip the steps based on
# input skip_steps.
warmup = (skip_steps or (self.train_steps - 100)) // FLAGS.log_steps
super(Resnet50KerasClassifierBenchmarkBase, self)._report_benchmark(
stats,
wall_time_sec,
total_batch_size=total_batch_size,
log_steps=FLAGS.log_steps,
warmup=warmup,
start_time_sec=start_time_sec)
def benchmark_1_gpu_no_dist_strat(self):
"""Tests Keras model with 1 GPU, no distribution strategy."""
self._setup()
self._run_and_report_benchmark(
experiment_name='benchmark_1_gpu_no_dist_strat',
num_gpus=1,
distribution_strategy='off',
per_replica_batch_size=128)
def benchmark_1_gpu_no_dist_strat_run_eagerly(self):
"""Tests Keras model with 1 GPU, no distribution strategy, run eagerly."""
self._setup()
self._run_and_report_benchmark(
experiment_name='benchmark_1_gpu_no_dist_strat_run_eagerly',
num_gpus=1,
run_eagerly=True,
distribution_strategy='off',
per_replica_batch_size=64)
def benchmark_1_gpu_no_dist_strat_run_eagerly_fp16(self):
"""Tests with 1 GPU, no distribution strategy, fp16, run eagerly."""
self._setup()
self._run_and_report_benchmark(
experiment_name='benchmark_1_gpu_no_dist_strat_run_eagerly_fp16',
num_gpus=1,
run_eagerly=True,
distribution_strategy='off',
dtype='float16',
per_replica_batch_size=128)
def benchmark_1_gpu(self):
"""Tests Keras model with 1 GPU."""
self._setup()
self._run_and_report_benchmark(
experiment_name='benchmark_1_gpu',
num_gpus=1,
distribution_strategy='one_device',
per_replica_batch_size=128)
def benchmark_xla_1_gpu(self):
"""Tests Keras model with XLA and 1 GPU."""
self._setup()
self._run_and_report_benchmark(
experiment_name='benchmark_xla_1_gpu',
num_gpus=1,
enable_xla=True,
distribution_strategy='one_device',
per_replica_batch_size=128)
def benchmark_1_gpu_fp16(self):
"""Tests Keras model with 1 GPU and fp16."""
self._setup()
self._run_and_report_benchmark(
experiment_name='benchmark_1_gpu_fp16',
num_gpus=1,
distribution_strategy='one_device',
dtype='float16',
per_replica_batch_size=256)
def benchmark_1_gpu_fp16_dynamic(self):
"""Tests Keras model with 1 GPU, fp16, and dynamic loss scaling."""
self._setup()
self._run_and_report_benchmark(
experiment_name='benchmark_1_gpu_fp16_dynamic',
num_gpus=1,
distribution_strategy='one_device',
dtype='float16',
per_replica_batch_size=256,
loss_scale='dynamic')
def benchmark_xla_1_gpu_fp16(self):
"""Tests Keras model with XLA, 1 GPU and fp16."""
self._setup()
self._run_and_report_benchmark(
experiment_name='benchmark_xla_1_gpu_fp16',
num_gpus=1,
enable_xla=True,
distribution_strategy='one_device',
dtype='float16',
per_replica_batch_size=256)
def benchmark_xla_1_gpu_fp16_tweaked(self):
"""Tests Keras model with XLA, 1 GPU, fp16, and manual config tuning."""
self._setup()
self._run_and_report_benchmark(
experiment_name='benchmark_xla_1_gpu_fp16_tweaked',
num_gpus=1,
enable_xla=True,
distribution_strategy='one_device',
dtype='float16',
per_replica_batch_size=256,
gpu_thread_mode='gpu_private')
def benchmark_xla_1_gpu_fp16_dynamic(self):
"""Tests Keras model with XLA, 1 GPU, fp16, and dynamic loss scaling."""
self._setup()
self._run_and_report_benchmark(
experiment_name='benchmark_xla_1_gpu_fp16_dynamic',
num_gpus=1,
enable_xla=True,
distribution_strategy='one_device',
dtype='float16',
per_replica_batch_size=256,
loss_scale='dynamic')
def benchmark_8_gpu(self):
"""Tests Keras model with 8 GPUs."""
self._setup()
self._run_and_report_benchmark(
experiment_name='benchmark_8_gpu',
num_gpus=8,
distribution_strategy='mirrored',
per_replica_batch_size=128)
def benchmark_8_gpu_tweaked(self):
"""Tests Keras model with manual config tuning and 8 GPUs."""
self._setup()
self._run_and_report_benchmark(
experiment_name='benchmark_8_gpu_tweaked',
num_gpus=8,
distribution_strategy='mirrored',
per_replica_batch_size=128,
dataset_num_private_threads=14)
def benchmark_xla_8_gpu(self):
"""Tests Keras model with XLA and 8 GPUs."""
self._setup()
self._run_and_report_benchmark(
experiment_name='benchmark_xla_8_gpu',
num_gpus=8,
enable_xla=True,
distribution_strategy='mirrored',
per_replica_batch_size=128)
def benchmark_xla_8_gpu_tweaked(self):
"""Tests Keras model with manual config tuning, 8 GPUs, and XLA."""
self._setup()
self._run_and_report_benchmark(
experiment_name='benchmark_xla_8_gpu_tweaked',
num_gpus=8,
enable_xla=True,
distribution_strategy='mirrored',
per_replica_batch_size=128,
gpu_thread_mode='gpu_private',
dataset_num_private_threads=24)
def benchmark_8_gpu_fp16(self):
"""Tests Keras model with 8 GPUs and fp16."""
self._setup()
self._run_and_report_benchmark(
experiment_name='benchmark_8_gpu_fp16',
num_gpus=8,
dtype='float16',
distribution_strategy='mirrored',
per_replica_batch_size=256)
def benchmark_8_gpu_fp16_tweaked(self):
"""Tests Keras model with 8 GPUs, fp16, and manual config tuning."""
self._setup()
self._run_and_report_benchmark(
experiment_name='benchmark_8_gpu_fp16_tweaked',
num_gpus=8,
dtype='float16',
distribution_strategy='mirrored',
per_replica_batch_size=256,
gpu_thread_mode='gpu_private',
dataset_num_private_threads=40)
def benchmark_8_gpu_fp16_dynamic_tweaked(self):
"""Tests Keras model with 8 GPUs, fp16, dynamic loss scaling, and tuned."""
self._setup()
self._run_and_report_benchmark(
experiment_name='benchmark_8_gpu_fp16_dynamic_tweaked',
num_gpus=8,
dtype='float16',
distribution_strategy='mirrored',
per_replica_batch_size=256,
loss_scale='dynamic',
gpu_thread_mode='gpu_private',
dataset_num_private_threads=40)
def benchmark_xla_8_gpu_fp16(self):
"""Tests Keras model with XLA, 8 GPUs and fp16."""
self._setup()
self._run_and_report_benchmark(
experiment_name='benchmark_xla_8_gpu_fp16',
dtype='float16',
num_gpus=8,
enable_xla=True,
distribution_strategy='mirrored',
per_replica_batch_size=256)
def benchmark_xla_8_gpu_fp16_tweaked(self):
"""Test Keras model with manual config tuning, XLA, 8 GPUs and fp16."""
self._setup()
self._run_and_report_benchmark(
experiment_name='benchmark_xla_8_gpu_fp16_tweaked',
dtype='float16',
num_gpus=8,
enable_xla=True,
distribution_strategy='mirrored',
per_replica_batch_size=256,
gpu_thread_mode='gpu_private',
dataset_num_private_threads=48)
def benchmark_xla_8_gpu_fp16_tweaked_delay_measure(self):
"""Tests with manual config tuning, XLA, 8 GPUs and fp16.
Delay performance measurement for stable performance on 96 vCPU platforms.
"""
self._setup()
self._run_and_report_benchmark(
experiment_name='benchmark_xla_8_gpu_fp16_tweaked_delay_measure',
dtype='float16',
num_gpus=8,
enable_xla=True,
distribution_strategy='mirrored',
per_replica_batch_size=256,
gpu_thread_mode='gpu_private',
dataset_num_private_threads=48,
steps=310)
def benchmark_xla_8_gpu_fp16_dynamic_tweaked(self):
"""Tests Keras model with config tuning, XLA, 8 GPUs and dynamic fp16."""
self._setup()
self._run_and_report_benchmark(
experiment_name='benchmark_xla_8_gpu_fp16_dynamic_tweaked',
dtype='float16',
num_gpus=8,
enable_xla=True,
distribution_strategy='mirrored',
per_replica_batch_size=256,
gpu_thread_mode='gpu_private',
loss_scale='dynamic',
dataset_num_private_threads=48)
def benchmark_2x2_tpu_fp16(self):
"""Test Keras model with 2x2 TPU, fp16."""
self._setup()
self._run_and_report_benchmark(
experiment_name='benchmark_2x2_tpu_fp16',
dtype='bfloat16',
num_tpus=8,
distribution_strategy='tpu',
per_replica_batch_size=128)
def benchmark_4x4_tpu_fp16(self):
"""Test Keras model with 4x4 TPU, fp16."""
self._setup()
self._run_and_report_benchmark(
experiment_name='benchmark_4x4_tpu_fp16',
dtype='bfloat16',
num_tpus=32,
distribution_strategy='tpu',
per_replica_batch_size=128)
def fill_report_object(self, stats):
super(Resnet50KerasClassifierBenchmarkBase, self).fill_report_object(
stats,
total_batch_size=FLAGS.batch_size,
log_steps=FLAGS.log_steps)
class Resnet50KerasBenchmarkBase(keras_benchmark.KerasBenchmark):
"""Resnet50 benchmarks."""
......@@ -318,18 +732,6 @@ class Resnet50KerasBenchmarkBase(keras_benchmark.KerasBenchmark):
FLAGS.batch_size = 128
self._run_and_report_benchmark()
def benchmark_graph_1_gpu_no_dist_strat(self):
"""Test Keras model in legacy graph mode with 1 GPU, no dist strat."""
self._setup()
FLAGS.num_gpus = 1
FLAGS.enable_eager = False
FLAGS.distribution_strategy = 'off'
FLAGS.model_dir = self._get_model_dir('benchmark_graph_1_gpu_no_dist_strat')
FLAGS.batch_size = 96 # BatchNorm is less efficient in legacy graph mode
# due to its reliance on v1 cond.
self._run_and_report_benchmark()
def benchmark_1_gpu(self):
"""Test Keras model with 1 GPU."""
self._setup()
......@@ -446,69 +848,6 @@ class Resnet50KerasBenchmarkBase(keras_benchmark.KerasBenchmark):
FLAGS.loss_scale = 'dynamic'
self._run_and_report_benchmark()
def benchmark_graph_1_gpu(self):
"""Test Keras model in legacy graph mode with 1 GPU."""
self._setup()
FLAGS.num_gpus = 1
FLAGS.enable_eager = False
FLAGS.distribution_strategy = 'one_device'
FLAGS.model_dir = self._get_model_dir('benchmark_graph_1_gpu')
FLAGS.batch_size = 128
self._run_and_report_benchmark()
def benchmark_graph_xla_1_gpu(self):
"""Test Keras model in legacy graph mode with XLA and 1 GPU."""
self._setup()
FLAGS.num_gpus = 1
FLAGS.enable_eager = False
FLAGS.enable_xla = True
FLAGS.distribution_strategy = 'one_device'
FLAGS.model_dir = self._get_model_dir('benchmark_graph_xla_1_gpu')
FLAGS.batch_size = 128
self._run_and_report_benchmark()
def benchmark_graph_1_gpu_fp16(self):
"""Test Keras model in legacy graph mode with 1 GPU and fp16."""
self._setup()
FLAGS.num_gpus = 1
FLAGS.dtype = 'fp16'
FLAGS.enable_eager = False
FLAGS.distribution_strategy = 'one_device'
FLAGS.model_dir = self._get_model_dir('benchmark_graph_1_gpu_fp16')
FLAGS.batch_size = 256
self._run_and_report_benchmark()
def benchmark_graph_xla_1_gpu_fp16(self):
"""Test Keras model in legacy graph mode with 1 GPU, fp16 and XLA."""
self._setup()
FLAGS.num_gpus = 1
FLAGS.dtype = 'fp16'
FLAGS.enable_eager = False
FLAGS.enable_xla = True
FLAGS.distribution_strategy = 'one_device'
FLAGS.model_dir = self._get_model_dir('benchmark_graph_xla_1_gpu_fp16')
FLAGS.batch_size = 256
self._run_and_report_benchmark()
def benchmark_graph_xla_1_gpu_fp16_tweaked(self):
"""Test Keras model in legacy graph with 1 GPU, fp16, XLA, and tuning."""
self._setup()
FLAGS.num_gpus = 1
FLAGS.enable_eager = False
FLAGS.enable_xla = True
FLAGS.distribution_strategy = 'one_device'
FLAGS.model_dir = self._get_model_dir(
'benchmark_graph_xla_1_gpu_fp16_tweaked')
FLAGS.dtype = 'fp16'
FLAGS.batch_size = 256
FLAGS.tf_gpu_thread_mode = 'gpu_private'
self._run_and_report_benchmark()
def benchmark_8_gpu(self):
"""Test Keras model with 8 GPUs."""
self._setup()
......@@ -608,6 +947,7 @@ class Resnet50KerasBenchmarkBase(keras_benchmark.KerasBenchmark):
FLAGS.model_dir = self._get_model_dir('benchmark_8_gpu_fp16_tweaked')
FLAGS.batch_size = 256 * 8 # 8 GPUs
FLAGS.tf_gpu_thread_mode = 'gpu_private'
FLAGS.dataset_num_private_threads = 40
self._run_and_report_benchmark()
def benchmark_8_gpu_fp16_dynamic_tweaked(self):
......@@ -623,6 +963,7 @@ class Resnet50KerasBenchmarkBase(keras_benchmark.KerasBenchmark):
FLAGS.batch_size = 256 * 8 # 8 GPUs
FLAGS.loss_scale = 'dynamic'
FLAGS.tf_gpu_thread_mode = 'gpu_private'
FLAGS.dataset_num_private_threads = 40
self._run_and_report_benchmark()
def benchmark_xla_8_gpu_fp16(self):
......@@ -669,6 +1010,7 @@ class Resnet50KerasBenchmarkBase(keras_benchmark.KerasBenchmark):
'benchmark_xla_8_gpu_fp16_tweaked_delay_measure')
FLAGS.batch_size = 256 * 8
FLAGS.tf_gpu_thread_mode = 'gpu_private'
FLAGS.datasets_num_private_threads = 48
FLAGS.train_steps = 310
self._run_and_report_benchmark()
......@@ -689,132 +1031,6 @@ class Resnet50KerasBenchmarkBase(keras_benchmark.KerasBenchmark):
FLAGS.datasets_num_private_threads = 48
self._run_and_report_benchmark()
def benchmark_graph_8_gpu(self):
"""Test Keras model in legacy graph mode with 8 GPUs."""
self._setup()
FLAGS.num_gpus = 8
FLAGS.enable_eager = False
FLAGS.distribution_strategy = 'mirrored'
FLAGS.model_dir = self._get_model_dir('benchmark_graph_8_gpu')
FLAGS.batch_size = 128 * 8 # 8 GPUs
self._run_and_report_benchmark()
def benchmark_graph_xla_8_gpu(self):
"""Test Keras model in legacy graph mode with XLA and 8 GPUs."""
self._setup()
FLAGS.num_gpus = 8
FLAGS.enable_eager = False
FLAGS.enable_xla = True
FLAGS.distribution_strategy = 'mirrored'
FLAGS.model_dir = self._get_model_dir('benchmark_graph_xla_8_gpu')
FLAGS.batch_size = 128 * 8 # 8 GPUs
self._run_and_report_benchmark()
def benchmark_graph_8_gpu_fp16(self):
"""Test Keras model in legacy graph mode with 8 GPUs and fp16."""
self._setup()
FLAGS.num_gpus = 8
FLAGS.dtype = 'fp16'
FLAGS.enable_eager = False
FLAGS.distribution_strategy = 'mirrored'
FLAGS.model_dir = self._get_model_dir('benchmark_graph_8_gpu_fp16')
FLAGS.batch_size = 256 * 8 # 8 GPUs
self._run_and_report_benchmark()
def benchmark_graph_xla_8_gpu_fp16(self):
"""Test Keras model in legacy graph mode with XLA, 8 GPUs and fp16."""
self._setup()
FLAGS.num_gpus = 8
FLAGS.dtype = 'fp16'
FLAGS.enable_eager = False
FLAGS.enable_xla = True
FLAGS.distribution_strategy = 'mirrored'
FLAGS.model_dir = self._get_model_dir('benchmark_graph_xla_8_gpu_fp16')
FLAGS.batch_size = 256 * 8 # 8 GPUs
self._run_and_report_benchmark()
def benchmark_graph_8_gpu_fp16_tweaked(self):
"""Test Keras model in legacy graph mode, tuning, 8 GPUs, and FP16."""
self._setup()
FLAGS.num_gpus = 8
FLAGS.dtype = 'fp16'
FLAGS.enable_eager = False
FLAGS.distribution_strategy = 'mirrored'
FLAGS.model_dir = self._get_model_dir('benchmark_graph_8_gpu_fp16_tweaked')
FLAGS.batch_size = 256 * 8 # 8 GPUs
FLAGS.tf_gpu_thread_mode = 'gpu_private'
self._run_and_report_benchmark()
def benchmark_graph_xla_8_gpu_fp16_tweaked(self):
"""Test Keras model in legacy graph tuning, XLA_FP16, 8 GPUs and fp16."""
self._setup()
FLAGS.num_gpus = 8
FLAGS.dtype = 'fp16'
FLAGS.enable_eager = False
FLAGS.enable_xla = True
FLAGS.distribution_strategy = 'mirrored'
FLAGS.model_dir = self._get_model_dir(
'benchmark_graph_xla_8_gpu_fp16_tweaked')
FLAGS.batch_size = 256 * 8 # 8 GPUs
FLAGS.tf_gpu_thread_mode = 'gpu_private'
self._run_and_report_benchmark()
def benchmark_graph_xla_8_gpu_fp16_tweaked_delay_measure(self):
"""Test in legacy graph mode with manual config tuning, XLA, 8 GPUs, fp16.
Delay performance measurement for stable performance on 96 vCPU platforms.
"""
self._setup()
FLAGS.num_gpus = 8
FLAGS.dtype = 'fp16'
FLAGS.enable_eager = False
FLAGS.enable_xla = True
FLAGS.distribution_strategy = 'mirrored'
FLAGS.model_dir = self._get_model_dir(
'benchmark_graph_xla_8_gpu_fp16_tweaked_delay_measure')
FLAGS.batch_size = 256 * 8
FLAGS.tf_gpu_thread_mode = 'gpu_private'
FLAGS.train_steps = 310
self._run_and_report_benchmark()
def benchmark_graph_8_gpu_fp16_dynamic_tweaked(self):
"""Test graph Keras with config tuning, 8 GPUs and dynamic fp16."""
self._setup()
FLAGS.num_gpus = 8
FLAGS.dtype = 'fp16'
FLAGS.enable_eager = False
FLAGS.distribution_strategy = 'mirrored'
FLAGS.model_dir = self._get_model_dir(
'benchmark_graph_8_gpu_fp16_dynamic_tweaked')
FLAGS.batch_size = 256 * 8 # 8 GPUs
FLAGS.loss_scale = 'dynamic'
FLAGS.tf_gpu_thread_mode = 'gpu_private'
self._run_and_report_benchmark()
def benchmark_graph_xla_8_gpu_fp16_dynamic_tweaked(self):
"""Test graph Keras with config tuning, XLA, 8 GPUs and dynamic fp16."""
self._setup()
FLAGS.num_gpus = 8
FLAGS.dtype = 'fp16'
FLAGS.enable_eager = False
FLAGS.enable_xla = True
FLAGS.distribution_strategy = 'mirrored'
FLAGS.model_dir = self._get_model_dir(
'benchmark_graph_xla_8_gpu_fp16_dynamic_tweaked')
FLAGS.batch_size = 256 * 8 # 8 GPUs
FLAGS.loss_scale = 'dynamic'
FLAGS.tf_gpu_thread_mode = 'gpu_private'
self._run_and_report_benchmark()
def benchmark_2x2_tpu_fp16(self):
"""Test Keras model with 2x2 TPU, fp16."""
self._setup()
......@@ -842,34 +1058,30 @@ class Resnet50KerasBenchmarkBase(keras_benchmark.KerasBenchmark):
log_steps=FLAGS.log_steps)
class Resnet50KerasBenchmarkSynth(Resnet50KerasBenchmarkBase):
class Resnet50KerasBenchmarkSynth(Resnet50KerasClassifierBenchmarkBase):
"""Resnet50 synthetic benchmark tests."""
def __init__(self, output_dir=None, root_data_dir=None, tpu=None, **kwargs):
def_flags = {}
def_flags['skip_eval'] = True
def_flags['report_accuracy_metrics'] = False
def_flags['use_synthetic_data'] = True
def_flags['train_steps'] = 110
def_flags['log_steps'] = 10
super(Resnet50KerasBenchmarkSynth, self).__init__(
output_dir=output_dir, default_flags=def_flags, tpu=tpu)
output_dir=output_dir, default_flags=def_flags, tpu=tpu,
dataset_builder='synthetic', train_epochs=1, train_steps=110)
class Resnet50KerasBenchmarkReal(Resnet50KerasBenchmarkBase):
class Resnet50KerasBenchmarkReal(Resnet50KerasClassifierBenchmarkBase):
"""Resnet50 real data benchmark tests."""
def __init__(self, output_dir=None, root_data_dir=None, tpu=None, **kwargs):
data_dir = os.path.join(root_data_dir, 'imagenet')
def_flags = {}
def_flags['skip_eval'] = True
def_flags['report_accuracy_metrics'] = False
def_flags['data_dir'] = os.path.join(root_data_dir, 'imagenet')
def_flags['train_steps'] = 110
def_flags['log_steps'] = 10
super(Resnet50KerasBenchmarkReal, self).__init__(
output_dir=output_dir, default_flags=def_flags, tpu=tpu)
output_dir=output_dir, default_flags=def_flags, tpu=tpu,
dataset_builder='records', train_epochs=1, train_steps=110,
data_dir=data_dir)
class Resnet50KerasBenchmarkRemoteData(Resnet50KerasBenchmarkBase):
......@@ -970,19 +1182,6 @@ class Resnet50KerasBenchmarkRemoteData(Resnet50KerasBenchmarkBase):
self._override_flags_to_run_test_shorter()
self._run_and_report_benchmark()
def benchmark_graph_1_gpu_no_dist_strat(self):
"""Test Keras model in legacy graph mode with 1 GPU, no dist strat."""
self._setup()
FLAGS.num_gpus = 1
FLAGS.enable_eager = False
FLAGS.distribution_strategy = 'off'
FLAGS.model_dir = self._get_model_dir('benchmark_graph_1_gpu_no_dist_strat')
FLAGS.batch_size = 96 # BatchNorm is less efficient in legacy graph mode
# due to its reliance on v1 cond.
self._override_flags_to_run_test_shorter()
self._run_and_report_benchmark()
def benchmark_1_gpu(self):
"""Test Keras model with 1 GPU."""
self._setup()
......@@ -1108,74 +1307,6 @@ class Resnet50KerasBenchmarkRemoteData(Resnet50KerasBenchmarkBase):
self._override_flags_to_run_test_shorter()
self._run_and_report_benchmark()
def benchmark_graph_1_gpu(self):
"""Test Keras model in legacy graph mode with 1 GPU."""
self._setup()
FLAGS.num_gpus = 1
FLAGS.enable_eager = False
FLAGS.distribution_strategy = 'one_device'
FLAGS.model_dir = self._get_model_dir('benchmark_graph_1_gpu')
FLAGS.batch_size = 128
self._override_flags_to_run_test_shorter()
self._run_and_report_benchmark()
def benchmark_graph_xla_1_gpu(self):
"""Test Keras model in legacy graph mode with XLA and 1 GPU."""
self._setup()
FLAGS.num_gpus = 1
FLAGS.enable_eager = False
FLAGS.enable_xla = True
FLAGS.distribution_strategy = 'one_device'
FLAGS.model_dir = self._get_model_dir('benchmark_graph_xla_1_gpu')
FLAGS.batch_size = 128
self._override_flags_to_run_test_shorter()
self._run_and_report_benchmark()
def benchmark_graph_1_gpu_fp16(self):
"""Test Keras model in legacy graph mode with 1 GPU and fp16."""
self._setup()
FLAGS.num_gpus = 1
FLAGS.dtype = 'fp16'
FLAGS.enable_eager = False
FLAGS.distribution_strategy = 'one_device'
FLAGS.model_dir = self._get_model_dir('benchmark_graph_1_gpu_fp16')
FLAGS.batch_size = 256
self._override_flags_to_run_test_shorter()
self._run_and_report_benchmark()
def benchmark_graph_xla_1_gpu_fp16(self):
"""Test Keras model in legacy graph mode with 1 GPU, fp16 and XLA."""
self._setup()
FLAGS.num_gpus = 1
FLAGS.dtype = 'fp16'
FLAGS.enable_eager = False
FLAGS.enable_xla = True
FLAGS.distribution_strategy = 'one_device'
FLAGS.model_dir = self._get_model_dir('benchmark_graph_xla_1_gpu_fp16')
FLAGS.batch_size = 256
self._override_flags_to_run_test_shorter()
self._run_and_report_benchmark()
def benchmark_graph_xla_1_gpu_fp16_tweaked(self):
"""Test Keras model in legacy graph with 1 GPU, fp16, XLA, and tuning."""
self._setup()
FLAGS.num_gpus = 1
FLAGS.enable_eager = False
FLAGS.enable_xla = True
FLAGS.distribution_strategy = 'one_device'
FLAGS.model_dir = self._get_model_dir(
'benchmark_graph_xla_1_gpu_fp16_tweaked')
FLAGS.dtype = 'fp16'
FLAGS.batch_size = 256
FLAGS.tf_gpu_thread_mode = 'gpu_private'
self._override_flags_to_run_test_shorter()
self._run_and_report_benchmark()
@benchmark_wrappers.enable_runtime_flags
def _run_and_report_benchmark(self):
if FLAGS.num_gpus == 1 or FLAGS.run_eagerly:
......@@ -1245,7 +1376,7 @@ class Resnet50MultiWorkerKerasAccuracy(keras_benchmark.KerasBenchmark):
"""Resnet50 distributed accuracy tests with multiple workers."""
def __init__(self, output_dir=None, root_data_dir=None, **kwargs):
flag_methods = [resnet_imagenet_main.define_imagenet_keras_flags]
flag_methods = [classifier_trainer.define_imagenet_keras_flags]
self.data_dir = os.path.join(root_data_dir, 'imagenet')
super(Resnet50MultiWorkerKerasAccuracy, self).__init__(
output_dir=output_dir, flag_methods=flag_methods)
......@@ -1278,7 +1409,7 @@ class Resnet50MultiWorkerKerasAccuracy(keras_benchmark.KerasBenchmark):
top_1_min=MIN_TOP_1_ACCURACY,
top_1_max=MAX_TOP_1_ACCURACY):
start_time_sec = time.time()
stats = resnet_imagenet_main.run(flags.FLAGS)
stats = classifier_trainer.run(flags.FLAGS)
wall_time_sec = time.time() - start_time_sec
super(Resnet50MultiWorkerKerasAccuracy, self)._report_benchmark(
......
......@@ -23,12 +23,13 @@ from absl import flags
from absl import logging
import numpy as np
import tensorflow as tf
from official.benchmark.models import cifar_preprocessing
from official.benchmark.models import resnet_cifar_model
from official.benchmark.models import synthetic_util
from official.utils.flags import core as flags_core
from official.utils.logs import logger
from official.utils.misc import distribution_utils
from official.utils.misc import keras_utils
from official.vision.image_classification.resnet import cifar_preprocessing
from official.vision.image_classification.resnet import common
......@@ -159,7 +160,7 @@ def run(flags_obj):
strategy_scope = distribution_utils.get_strategy_scope(strategy)
if flags_obj.use_synthetic_data:
distribution_utils.set_up_synthetic_data()
synthetic_util.set_up_synthetic_data()
input_fn = common.get_synth_input_fn(
height=cifar_preprocessing.HEIGHT,
width=cifar_preprocessing.WIDTH,
......@@ -168,7 +169,7 @@ def run(flags_obj):
dtype=flags_core.get_tf_dtype(flags_obj),
drop_remainder=True)
else:
distribution_utils.undo_set_up_synthetic_data()
synthetic_util.undo_set_up_synthetic_data()
input_fn = cifar_preprocessing.input_fn
train_input_dataset = input_fn(
......
......@@ -24,10 +24,10 @@ import tensorflow as tf
from tensorflow.python.eager import context
from tensorflow.python.platform import googletest
from official.benchmark.models import cifar_preprocessing
from official.benchmark.models import resnet_cifar_main
from official.utils.misc import keras_utils
from official.utils.testing import integration
from official.vision.image_classification.resnet import cifar_preprocessing
class KerasCifarTest(googletest.TestCase):
......
# Copyright 2018 The TensorFlow Authors. All Rights Reserved.
# Copyright 2019 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
......@@ -98,7 +98,6 @@ def run(flags_obj):
# pylint: disable=protected-access
if flags_obj.use_synthetic_data:
distribution_utils.set_up_synthetic_data()
input_fn = common.get_synth_input_fn(
height=imagenet_preprocessing.DEFAULT_IMAGE_SIZE,
width=imagenet_preprocessing.DEFAULT_IMAGE_SIZE,
......@@ -107,7 +106,6 @@ def run(flags_obj):
dtype=dtype,
drop_remainder=True)
else:
distribution_utils.undo_set_up_synthetic_data()
input_fn = imagenet_preprocessing.input_fn
# When `enable_xla` is True, we always drop the remainder of the batches
......
# Copyright 2017 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Test the keras ResNet model with ImageNet data."""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
from absl.testing import parameterized
import tensorflow as tf
from tensorflow.python.eager import context
from official.benchmark.models import resnet_imagenet_main
from official.utils.misc import keras_utils
from official.utils.testing import integration
from official.vision.image_classification.resnet import imagenet_preprocessing
@parameterized.parameters(
"resnet",
# "resnet_polynomial_decay", b/151854314
"mobilenet",
# "mobilenet_polynomial_decay" b/151854314
)
class KerasImagenetTest(tf.test.TestCase):
"""Unit tests for Keras Models with ImageNet."""
_default_flags_dict = [
"-batch_size", "4",
"-train_steps", "1",
"-use_synthetic_data", "true",
"-data_format", "channels_last",
]
_extra_flags_dict = {
"resnet": [
"-model", "resnet50_v1.5",
"-optimizer", "resnet50_default",
],
"resnet_polynomial_decay": [
"-model", "resnet50_v1.5",
"-optimizer", "resnet50_default",
"-pruning_method", "polynomial_decay",
],
"mobilenet": [
"-model", "mobilenet",
"-optimizer", "mobilenet_default",
],
"mobilenet_polynomial_decay": [
"-model", "mobilenet",
"-optimizer", "mobilenet_default",
"-pruning_method", "polynomial_decay",
],
}
_tempdir = None
@classmethod
def setUpClass(cls): # pylint: disable=invalid-name
super(KerasImagenetTest, cls).setUpClass()
resnet_imagenet_main.define_imagenet_keras_flags()
def setUp(self):
super(KerasImagenetTest, self).setUp()
imagenet_preprocessing.NUM_IMAGES["validation"] = 4
self.policy = \
tf.compat.v2.keras.mixed_precision.experimental.global_policy()
def tearDown(self):
super(KerasImagenetTest, self).tearDown()
tf.io.gfile.rmtree(self.get_temp_dir())
tf.compat.v2.keras.mixed_precision.experimental.set_policy(self.policy)
def get_extra_flags_dict(self, flags_key):
return self._extra_flags_dict[flags_key] + self._default_flags_dict
def test_end_to_end_no_dist_strat(self, flags_key):
"""Test Keras model with 1 GPU, no distribution strategy."""
config = keras_utils.get_config_proto_v1()
tf.compat.v1.enable_eager_execution(config=config)
extra_flags = [
"-distribution_strategy", "off",
]
extra_flags = extra_flags + self.get_extra_flags_dict(flags_key)
integration.run_synthetic(
main=resnet_imagenet_main.run,
tmp_root=self.get_temp_dir(),
extra_flags=extra_flags
)
def test_end_to_end_graph_no_dist_strat(self, flags_key):
"""Test Keras model in legacy graph mode with 1 GPU, no dist strat."""
extra_flags = [
"-enable_eager", "false",
"-distribution_strategy", "off",
]
extra_flags = extra_flags + self.get_extra_flags_dict(flags_key)
integration.run_synthetic(
main=resnet_imagenet_main.run,
tmp_root=self.get_temp_dir(),
extra_flags=extra_flags
)
def test_end_to_end_1_gpu(self, flags_key):
"""Test Keras model with 1 GPU."""
config = keras_utils.get_config_proto_v1()
tf.compat.v1.enable_eager_execution(config=config)
if context.num_gpus() < 1:
self.skipTest(
"{} GPUs are not available for this test. {} GPUs are available".
format(1, context.num_gpus()))
extra_flags = [
"-num_gpus", "1",
"-distribution_strategy", "mirrored",
"-enable_checkpoint_and_export", "1",
]
extra_flags = extra_flags + self.get_extra_flags_dict(flags_key)
integration.run_synthetic(
main=resnet_imagenet_main.run,
tmp_root=self.get_temp_dir(),
extra_flags=extra_flags
)
def test_end_to_end_1_gpu_fp16(self, flags_key):
"""Test Keras model with 1 GPU and fp16."""
config = keras_utils.get_config_proto_v1()
tf.compat.v1.enable_eager_execution(config=config)
if context.num_gpus() < 1:
self.skipTest(
"{} GPUs are not available for this test. {} GPUs are available"
.format(1, context.num_gpus()))
extra_flags = [
"-num_gpus", "1",
"-dtype", "fp16",
"-distribution_strategy", "mirrored",
]
extra_flags = extra_flags + self.get_extra_flags_dict(flags_key)
if "polynomial_decay" in extra_flags:
self.skipTest("Pruning with fp16 is not currently supported.")
integration.run_synthetic(
main=resnet_imagenet_main.run,
tmp_root=self.get_temp_dir(),
extra_flags=extra_flags
)
def test_end_to_end_2_gpu(self, flags_key):
"""Test Keras model with 2 GPUs."""
config = keras_utils.get_config_proto_v1()
tf.compat.v1.enable_eager_execution(config=config)
if context.num_gpus() < 2:
self.skipTest(
"{} GPUs are not available for this test. {} GPUs are available".
format(2, context.num_gpus()))
extra_flags = [
"-num_gpus", "2",
"-distribution_strategy", "mirrored",
]
extra_flags = extra_flags + self.get_extra_flags_dict(flags_key)
integration.run_synthetic(
main=resnet_imagenet_main.run,
tmp_root=self.get_temp_dir(),
extra_flags=extra_flags
)
def test_end_to_end_xla_2_gpu(self, flags_key):
"""Test Keras model with XLA and 2 GPUs."""
config = keras_utils.get_config_proto_v1()
tf.compat.v1.enable_eager_execution(config=config)
if context.num_gpus() < 2:
self.skipTest(
"{} GPUs are not available for this test. {} GPUs are available".
format(2, context.num_gpus()))
extra_flags = [
"-num_gpus", "2",
"-enable_xla", "true",
"-distribution_strategy", "mirrored",
]
extra_flags = extra_flags + self.get_extra_flags_dict(flags_key)
integration.run_synthetic(
main=resnet_imagenet_main.run,
tmp_root=self.get_temp_dir(),
extra_flags=extra_flags
)
def test_end_to_end_2_gpu_fp16(self, flags_key):
"""Test Keras model with 2 GPUs and fp16."""
config = keras_utils.get_config_proto_v1()
tf.compat.v1.enable_eager_execution(config=config)
if context.num_gpus() < 2:
self.skipTest(
"{} GPUs are not available for this test. {} GPUs are available".
format(2, context.num_gpus()))
extra_flags = [
"-num_gpus", "2",
"-dtype", "fp16",
"-distribution_strategy", "mirrored",
]
extra_flags = extra_flags + self.get_extra_flags_dict(flags_key)
if "polynomial_decay" in extra_flags:
self.skipTest("Pruning with fp16 is not currently supported.")
integration.run_synthetic(
main=resnet_imagenet_main.run,
tmp_root=self.get_temp_dir(),
extra_flags=extra_flags
)
def test_end_to_end_xla_2_gpu_fp16(self, flags_key):
"""Test Keras model with XLA, 2 GPUs and fp16."""
config = keras_utils.get_config_proto_v1()
tf.compat.v1.enable_eager_execution(config=config)
if context.num_gpus() < 2:
self.skipTest(
"{} GPUs are not available for this test. {} GPUs are available".
format(2, context.num_gpus()))
extra_flags = [
"-num_gpus", "2",
"-dtype", "fp16",
"-enable_xla", "true",
"-distribution_strategy", "mirrored",
]
extra_flags = extra_flags + self.get_extra_flags_dict(flags_key)
if "polynomial_decay" in extra_flags:
self.skipTest("Pruning with fp16 is not currently supported.")
integration.run_synthetic(
main=resnet_imagenet_main.run,
tmp_root=self.get_temp_dir(),
extra_flags=extra_flags
)
if __name__ == "__main__":
tf.test.main()
# Copyright 2017 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Test the keras ResNet model with ImageNet data on TPU."""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
from absl.testing import parameterized
import tensorflow as tf
from official.benchmark.models import resnet_imagenet_main
from official.utils.misc import keras_utils
from official.utils.testing import integration
from official.vision.image_classification.resnet import imagenet_preprocessing
class KerasImagenetTest(tf.test.TestCase, parameterized.TestCase):
"""Unit tests for Keras Models with ImageNet."""
_extra_flags_dict = {
"resnet": [
"-batch_size", "4",
"-train_steps", "1",
"-use_synthetic_data", "true"
"-model", "resnet50_v1.5",
"-optimizer", "resnet50_default",
],
"resnet_polynomial_decay": [
"-batch_size", "4",
"-train_steps", "1",
"-use_synthetic_data", "true",
"-model", "resnet50_v1.5",
"-optimizer", "resnet50_default",
"-pruning_method", "polynomial_decay",
],
}
_tempdir = None
@classmethod
def setUpClass(cls): # pylint: disable=invalid-name
super(KerasImagenetTest, cls).setUpClass()
resnet_imagenet_main.define_imagenet_keras_flags()
def setUp(self):
super(KerasImagenetTest, self).setUp()
imagenet_preprocessing.NUM_IMAGES["validation"] = 4
self.policy = \
tf.compat.v2.keras.mixed_precision.experimental.global_policy()
def tearDown(self):
super(KerasImagenetTest, self).tearDown()
tf.io.gfile.rmtree(self.get_temp_dir())
tf.compat.v2.keras.mixed_precision.experimental.set_policy(self.policy)
@parameterized.parameters([
"resnet",
# "resnet_polynomial_decay" b/151854314
])
def test_end_to_end_tpu(self, flags_key):
"""Test Keras model with TPU distribution strategy."""
config = keras_utils.get_config_proto_v1()
tf.compat.v1.enable_eager_execution(config=config)
extra_flags = [
"-distribution_strategy", "tpu",
"-data_format", "channels_last",
"-enable_checkpoint_and_export", "1",
]
extra_flags = extra_flags + self._extra_flags_dict[flags_key]
integration.run_synthetic(
main=resnet_imagenet_main.run,
tmp_root=self.get_temp_dir(),
extra_flags=extra_flags
)
@parameterized.parameters(["resnet"])
def test_end_to_end_tpu_bf16(self, flags_key):
"""Test Keras model with TPU and bfloat16 activation."""
config = keras_utils.get_config_proto_v1()
tf.compat.v1.enable_eager_execution(config=config)
extra_flags = [
"-distribution_strategy", "tpu",
"-data_format", "channels_last",
"-dtype", "bf16",
]
extra_flags = extra_flags + self._extra_flags_dict[flags_key]
integration.run_synthetic(
main=resnet_imagenet_main.run,
tmp_root=self.get_temp_dir(),
extra_flags=extra_flags
)
if __name__ == "__main__":
tf.test.main()
......@@ -47,7 +47,6 @@ def define_flags():
epochs_between_evals=False,
stop_threshold=False,
num_gpu=True,
hooks=False,
export_dir=False,
run_eagerly=True,
distribution_strategy=True)
......
Markdown is supported
0% or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment