Unverified commit 00fa8b12 authored by cclauss, committed by GitHub

Merge branch 'master' into patch-13

parents 6d257a4f 1f34fcaf
/official/ @nealwu @k-w-w @karmel
/research/adversarial_crypto/ @dave-andersen
/research/adversarial_text/ @rsepassi
/research/adv_imagenet_models/ @AlexeyKurakin
/research/attention_ocr/ @alexgorban
/research/audioset/ @plakal @dpwe
/research/autoencoders/ @snurkabill
/research/brain_coder/ @danabo
/research/cognitive_mapping_and_planning/ @s-gupta
/research/compression/ @nmjohn
/research/delf/ @andrefaraujo
/research/differential_privacy/ @panyx0718
/research/domain_adaptation/ @bousmalis @dmrd
/research/gan/ @joel-shor
/research/im2txt/ @cshallue
/research/inception/ @shlens @vincentvanhoucke
/research/learned_optimizer/ @olganw @nirum
/research/learning_to_remember_rare_events/ @lukaszkaiser @ofirnachum
/research/lfads/ @jazcollins @susillo
/research/lm_1b/ @oriolvinyals @panyx0718
/research/namignizer/ @knathanieltucker
/research/neural_gpu/ @lukaszkaiser
/research/neural_programmer/ @arvind2505
/research/next_frame_prediction/ @panyx0718
/research/object_detection/ @jch1 @tombstone @derekjchow @jesu9 @dreamdragon
/research/pcl_rl/ @ofirnachum
/research/ptn/ @xcyan @arkanath @hellojas @honglaklee
/research/real_nvp/ @laurent-dinh
/research/rebar/ @gjtucker
/research/resnet/ @panyx0718
/research/skip_thoughts/ @cshallue
/research/slim/ @sguada @nathansilberman
/research/street/ @theraysmith
/research/swivel/ @waterson
/research/syntaxnet/ @calberti @andorardo @bogatyy @markomernick
/research/tcn/ @coreylynch @sermanet
/research/textsum/ @panyx0718 @peterjliu
/research/transformer/ @daviddao
/research/video_prediction/ @cbfinn
/research/fivo/ @dieterichlawson
/samples/ @MarkDaoust
/samples/languages/java/ @asimshankar
/tutorials/embedding/ @zffchen78 @a-dai
/tutorials/image/ @sherrym @shlens
/tutorials/image/cifar10_estimator/ @tfboyd @protoget
/tutorials/rnn/ @lukaszkaiser @ebrevdo
@@ -2,7 +2,7 @@
This directory builds a convolutional neural net to classify the [MNIST
dataset](http://yann.lecun.com/exdb/mnist/) using the
[tf.data](https://www.tensorflow.org/api_docs/python/tf/data),
[tf.estimator.Estimator](https://www.tensorflow.org/api_docs/python/tf/estimator/Estimator),
and
[tf.layers](https://www.tensorflow.org/api_docs/python/tf/layers)
@@ -12,18 +12,51 @@ APIs.
## Setup

To begin, you'll simply need the latest version of TensorFlow installed.
Then to train the model, run the following:

```
python mnist.py
```

The model will begin training and will automatically evaluate itself on the
validation data.

Illustrative unit tests and benchmarks can be run with:

```
python mnist_test.py
python mnist_test.py --benchmarks=.
```

## Exporting the model

You can export the model into TensorFlow [SavedModel](https://www.tensorflow.org/programmers_guide/saved_model) format by using the argument `--export_dir`:

```
python mnist.py --export_dir /tmp/mnist_saved_model
```

The SavedModel will be saved in a timestamped directory under `/tmp/mnist_saved_model/` (e.g. `/tmp/mnist_saved_model/1513630966/`).

**Getting predictions with SavedModel**

Use [`saved_model_cli`](https://www.tensorflow.org/programmers_guide/saved_model#cli_to_inspect_and_execute_savedmodel) to inspect and execute the SavedModel.

```
saved_model_cli run --dir /tmp/mnist_saved_model/TIMESTAMP --tag_set serve --signature_def classify --inputs image=examples.npy
```

`examples.npy` contains the data from `example5.png` and `example3.png` in a numpy array, in that order. The array values are normalized to values between 0 and 1.

The output should look similar to below:

```
Result for output key classes:
[5 3]

Result for output key probabilities:
[[  1.53558474e-07   1.95694142e-13   1.31193523e-09   5.47467265e-03
    5.85711526e-22   9.94520664e-01   3.48423509e-06   2.65365645e-17
    9.78631419e-07   3.15522470e-08]
 [  1.22413359e-04   5.87615965e-08   1.72251271e-06   9.39960718e-01
    3.30306928e-11   2.87386645e-02   2.82353517e-02   8.21146413e-18
    2.52568233e-03   4.15460236e-04]]
```
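The README does not show how `examples.npy` might be produced. A minimal sketch, assuming only what the text states (a 2-image numpy array, pixel values normalized to [0, 1]); the synthetic data below stands in for the real PNG files:

```python
import os
import tempfile

import numpy as np

# Two synthetic 28x28 grayscale images standing in for example5.png and
# example3.png; real images would be loaded (e.g. with PIL) and converted
# to arrays first.
images = np.random.randint(0, 256, size=(2, 28, 28)).astype(np.float32)

# Normalize pixel values from [0, 255] to [0, 1], as the README describes.
images /= 255.0

path = os.path.join(tempfile.mkdtemp(), 'examples.npy')
np.save(path, images)

loaded = np.load(path)
print(loaded.shape)  # (2, 28, 28)
```

The resulting file can then be passed to `saved_model_cli` via `--inputs image=examples.npy`.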
# Copyright 2015 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Converts MNIST data to TFRecords file format with Example protos.
To read about optimizations that can be applied to the input preprocessing
stage, see: https://www.tensorflow.org/performance/performance_guide#input_pipeline_optimization.
"""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import argparse
import os
import sys

import tensorflow as tf

from tensorflow.contrib.learn.python.learn.datasets import mnist

parser = argparse.ArgumentParser()

parser.add_argument('--directory', type=str, default='/tmp/mnist_data',
                    help='Directory to download data files and write the '
                    'converted result.')

parser.add_argument('--validation_size', type=int, default=0,
                    help='Number of examples to separate from the training '
                    'data for the validation set.')


def _int64_feature(value):
  return tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))


def _bytes_feature(value):
  return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))


def convert_to(dataset, name, directory):
  """Converts a dataset to TFRecords."""
  images = dataset.images
  labels = dataset.labels
  num_examples = dataset.num_examples

  if images.shape[0] != num_examples:
    raise ValueError('Images size %d does not match label size %d.' %
                     (images.shape[0], num_examples))
  rows = images.shape[1]
  cols = images.shape[2]
  depth = images.shape[3]

  filename = os.path.join(directory, name + '.tfrecords')
  print('Writing', filename)
  writer = tf.python_io.TFRecordWriter(filename)
  for index in range(num_examples):
    image_raw = images[index].tostring()
    example = tf.train.Example(features=tf.train.Features(feature={
        'height': _int64_feature(rows),
        'width': _int64_feature(cols),
        'depth': _int64_feature(depth),
        'label': _int64_feature(int(labels[index])),
        'image_raw': _bytes_feature(image_raw)}))
    writer.write(example.SerializeToString())
  writer.close()


def main(unused_argv):
  # Get the data.
  datasets = mnist.read_data_sets(FLAGS.directory,
                                  dtype=tf.uint8,
                                  reshape=False,
                                  validation_size=FLAGS.validation_size)

  # Convert to Examples and write the result to TFRecords.
  convert_to(datasets.train, 'train', FLAGS.directory)
  convert_to(datasets.validation, 'validation', FLAGS.directory)
  convert_to(datasets.test, 'test', FLAGS.directory)


if __name__ == '__main__':
  tf.logging.set_verbosity(tf.logging.INFO)
  FLAGS, unparsed = parser.parse_known_args()
  tf.app.run(main=main, argv=[sys.argv[0]] + unparsed)
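The conversion above stores each 28x28 uint8 image as a raw byte string plus integer metadata. A quick sanity check of the sizes involved (pure-Python arithmetic, not part of the repository):

```python
# Each MNIST image is 28x28 pixels with one channel, stored as uint8,
# so the raw payload per image is:
rows, cols, depth = 28, 28, 1
bytes_per_image = rows * cols * depth
print(bytes_per_image)  # 784

# With the default --validation_size=0, all 60000 training images go into
# train.tfrecords; the raw pixel payload alone is then:
num_train = 60000
payload_bytes = num_train * bytes_per_image
print(payload_bytes)  # 47040000, i.e. roughly 45 MB before per-record framing
```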
# Copyright 2018 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""tf.data.Dataset interface to the MNIST dataset."""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import os
import shutil
import gzip
import numpy as np
from six.moves import urllib
import tensorflow as tf
def read32(bytestream):
"""Read 4 bytes from bytestream as an unsigned 32-bit integer."""
dt = np.dtype(np.uint32).newbyteorder('>')
return np.frombuffer(bytestream.read(4), dtype=dt)[0]
def check_image_file_header(filename):
"""Validate that filename corresponds to images for the MNIST dataset."""
with tf.gfile.Open(filename, 'rb') as f:
magic = read32(f)
num_images = read32(f)
rows = read32(f)
cols = read32(f)
if magic != 2051:
raise ValueError('Invalid magic number %d in MNIST file %s' % (magic,
f.name))
if rows != 28 or cols != 28:
raise ValueError(
'Invalid MNIST file %s: Expected 28x28 images, found %dx%d' %
(f.name, rows, cols))
def check_labels_file_header(filename):
"""Validate that filename corresponds to labels for the MNIST dataset."""
with tf.gfile.Open(filename, 'rb') as f:
magic = read32(f)
num_items = read32(f)
if magic != 2049:
raise ValueError('Invalid magic number %d in MNIST file %s' % (magic,
f.name))
def download(directory, filename):
"""Download (and unzip) a file from the MNIST dataset if not already done."""
filepath = os.path.join(directory, filename)
if tf.gfile.Exists(filepath):
return filepath
if not tf.gfile.Exists(directory):
tf.gfile.MakeDirs(directory)
# CVDF mirror of http://yann.lecun.com/exdb/mnist/
url = 'https://storage.googleapis.com/cvdf-datasets/mnist/' + filename + '.gz'
zipped_filepath = filepath + '.gz'
print('Downloading %s to %s' % (url, zipped_filepath))
urllib.request.urlretrieve(url, zipped_filepath)
with gzip.open(zipped_filepath, 'rb') as f_in, open(filepath, 'wb') as f_out:
shutil.copyfileobj(f_in, f_out)
os.remove(zipped_filepath)
return filepath
def dataset(directory, images_file, labels_file):
images_file = download(directory, images_file)
labels_file = download(directory, labels_file)
check_image_file_header(images_file)
check_labels_file_header(labels_file)
def decode_image(image):
# Normalize from [0, 255] to [0.0, 1.0]
image = tf.decode_raw(image, tf.uint8)
image = tf.cast(image, tf.float32)
image = tf.reshape(image, [784])
return image / 255.0
def one_hot_label(label):
label = tf.decode_raw(label, tf.uint8) # tf.string -> tf.uint8
label = tf.reshape(label, []) # label is a scalar
return tf.one_hot(label, 10)
images = tf.data.FixedLengthRecordDataset(
images_file, 28 * 28, header_bytes=16).map(decode_image)
labels = tf.data.FixedLengthRecordDataset(
labels_file, 1, header_bytes=8).map(one_hot_label)
return tf.data.Dataset.zip((images, labels))
def train(directory):
"""tf.data.Dataset object for MNIST training data."""
return dataset(directory, 'train-images-idx3-ubyte',
'train-labels-idx1-ubyte')
def test(directory):
"""tf.data.Dataset object for MNIST test data."""
return dataset(directory, 't10k-images-idx3-ubyte', 't10k-labels-idx1-ubyte')
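`read32` depends on the MNIST IDX format storing integers big-endian. A minimal sketch of the same decode using only the stdlib `struct` module (the magic-number bytes below follow the checks in the code above):

```python
import struct

# The images file starts with magic number 2051 (0x00000803) and the labels
# file with 2049 (0x00000801), both big-endian unsigned 32-bit ints ('>I').
images_magic = struct.unpack('>I', b'\x00\x00\x08\x03')[0]
labels_magic = struct.unpack('>I', b'\x00\x00\x08\x01')[0]
print(images_magic, labels_magic)  # 2051 2049

# header_bytes=16 in the FixedLengthRecordDataset above comes from four such
# 4-byte fields: magic, num_images, rows, cols. The labels header has two,
# hence header_bytes=8.
image_header_bytes = 4 * 4
label_header_bytes = 2 * 4
print(image_header_bytes, label_header_bytes)  # 16 8
```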
@@ -22,228 +22,186 @@ import os
import sys

import tensorflow as tf

import dataset


class Model(object):
  """Class that defines a graph to recognize digits in the MNIST dataset."""

  def __init__(self, data_format):
    """Creates a model for classifying a hand-written digit.

    Args:
      data_format: Either 'channels_first' or 'channels_last'.
        'channels_first' is typically faster on GPUs while 'channels_last' is
        typically faster on CPUs. See
        https://www.tensorflow.org/performance/performance_guide#data_formats
    """
    if data_format == 'channels_first':
      self._input_shape = [-1, 1, 28, 28]
    else:
      assert data_format == 'channels_last'
      self._input_shape = [-1, 28, 28, 1]

    self.conv1 = tf.layers.Conv2D(
        32, 5, padding='same', data_format=data_format, activation=tf.nn.relu)
    self.conv2 = tf.layers.Conv2D(
        64, 5, padding='same', data_format=data_format, activation=tf.nn.relu)
    self.fc1 = tf.layers.Dense(1024, activation=tf.nn.relu)
    self.fc2 = tf.layers.Dense(10)
    self.dropout = tf.layers.Dropout(0.4)
    self.max_pool2d = tf.layers.MaxPooling2D(
        (2, 2), (2, 2), padding='same', data_format=data_format)

  def __call__(self, inputs, training):
    """Add operations to classify a batch of input images.

    Args:
      inputs: A Tensor representing a batch of input images.
      training: A boolean. Set to True to add operations required only when
        training the classifier.

    Returns:
      A logits Tensor with shape [<batch_size>, 10].
    """
    y = tf.reshape(inputs, self._input_shape)
    y = self.conv1(y)
    y = self.max_pool2d(y)
    y = self.conv2(y)
    y = self.max_pool2d(y)
    y = tf.layers.flatten(y)
    y = self.fc1(y)
    y = self.dropout(y, training=training)
    return self.fc2(y)


def model_fn(features, labels, mode, params):
  """The model_fn argument for creating an Estimator."""
  model = Model(params['data_format'])
  image = features
  if isinstance(image, dict):
    image = features['image']

  if mode == tf.estimator.ModeKeys.PREDICT:
    logits = model(image, training=False)
    predictions = {
        'classes': tf.argmax(logits, axis=1),
        'probabilities': tf.nn.softmax(logits),
    }
    return tf.estimator.EstimatorSpec(
        mode=tf.estimator.ModeKeys.PREDICT,
        predictions=predictions,
        export_outputs={
            'classify': tf.estimator.export.PredictOutput(predictions)
        })

  if mode == tf.estimator.ModeKeys.TRAIN:
    optimizer = tf.train.AdamOptimizer(learning_rate=1e-4)
    logits = model(image, training=True)
    loss = tf.losses.softmax_cross_entropy(onehot_labels=labels, logits=logits)
    accuracy = tf.metrics.accuracy(
        labels=tf.argmax(labels, axis=1),
        predictions=tf.argmax(logits, axis=1))
    # Name the accuracy tensor 'train_accuracy' to demonstrate the
    # LoggingTensorHook.
    tf.identity(accuracy[1], name='train_accuracy')
    tf.summary.scalar('train_accuracy', accuracy[1])
    return tf.estimator.EstimatorSpec(
        mode=tf.estimator.ModeKeys.TRAIN,
        loss=loss,
        train_op=optimizer.minimize(loss, tf.train.get_or_create_global_step()))

  if mode == tf.estimator.ModeKeys.EVAL:
    logits = model(image, training=False)
    loss = tf.losses.softmax_cross_entropy(onehot_labels=labels, logits=logits)
    return tf.estimator.EstimatorSpec(
        mode=tf.estimator.ModeKeys.EVAL,
        loss=loss,
        eval_metric_ops={
            'accuracy':
                tf.metrics.accuracy(
                    labels=tf.argmax(labels, axis=1),
                    predictions=tf.argmax(logits, axis=1)),
        })


def main(unused_argv):
  data_format = FLAGS.data_format
  if data_format is None:
    data_format = ('channels_first'
                   if tf.test.is_built_with_cuda() else 'channels_last')

  mnist_classifier = tf.estimator.Estimator(
      model_fn=model_fn,
      model_dir=FLAGS.model_dir,
      params={
          'data_format': data_format
      })

  # Train the model
  def train_input_fn():
    # When choosing shuffle buffer sizes, larger sizes result in better
    # randomness, while smaller sizes use less memory. MNIST is a small
    # enough dataset that we can easily shuffle the full epoch.
    ds = dataset.train(FLAGS.data_dir)
    ds = ds.cache().shuffle(buffer_size=50000).batch(FLAGS.batch_size).repeat(
        FLAGS.train_epochs)
    (images, labels) = ds.make_one_shot_iterator().get_next()
    return (images, labels)

  # Set up training hook that logs the training accuracy every 100 steps.
  tensors_to_log = {'train_accuracy': 'train_accuracy'}
  logging_hook = tf.train.LoggingTensorHook(
      tensors=tensors_to_log, every_n_iter=100)
  mnist_classifier.train(input_fn=train_input_fn, hooks=[logging_hook])

  # Evaluate the model and print results
  def eval_input_fn():
    return dataset.test(FLAGS.data_dir).batch(
        FLAGS.batch_size).make_one_shot_iterator().get_next()

  eval_results = mnist_classifier.evaluate(input_fn=eval_input_fn)
  print()
  print('Evaluation results:\n\t%s' % eval_results)

  # Export the model
  if FLAGS.export_dir is not None:
    image = tf.placeholder(tf.float32, [None, 28, 28])
    input_fn = tf.estimator.export.build_raw_serving_input_receiver_fn({
        'image': image,
    })
    mnist_classifier.export_savedmodel(FLAGS.export_dir, input_fn)


if __name__ == '__main__':
  parser = argparse.ArgumentParser()
  parser.add_argument(
      '--batch_size',
      type=int,
      default=100,
      help='Number of images to process in a batch')
  parser.add_argument(
      '--data_dir',
      type=str,
      default='/tmp/mnist_data',
      help='Path to directory containing the MNIST dataset')
  parser.add_argument(
      '--model_dir',
      type=str,
      default='/tmp/mnist_model',
      help='The directory where the model will be stored.')
  parser.add_argument(
      '--train_epochs', type=int, default=40, help='Number of epochs to train.')
  parser.add_argument(
      '--data_format',
      type=str,
      default=None,
      choices=['channels_first', 'channels_last'],
      help='A flag to override the data format used in the model. channels_first '
      'provides a performance boost on GPU but is not always compatible '
      'with CPU. If left unspecified, the data format will be chosen '
      'automatically based on whether TensorFlow was built for CPU or GPU.')
  parser.add_argument(
      '--export_dir',
      type=str,
      help='The directory where the exported SavedModel will be stored.')
  tf.logging.set_verbosity(tf.logging.INFO)
  FLAGS, unparsed = parser.parse_known_args()
  tf.app.run(main=main, argv=[sys.argv[0]] + unparsed)
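With `padding='same'`, each 2x2/stride-2 max pool halves the spatial size (28 -> 14 -> 7), so the tensor entering `fc1` flattens to 7 * 7 * 64 features. A quick pure-Python check of the shapes implied by the layers above (not part of the repository):

```python
def same_pool(size, stride=2):
  # 'same' padding: output size is ceil(input / stride).
  return -(-size // stride)

size = 28
size = same_pool(size)  # conv1 keeps 28 (same padding); pool -> 14
size = same_pool(size)  # conv2 keeps 14; pool -> 7
print(size)  # 7

flattened = size * size * 64  # 64 filters in conv2
print(flattened)  # 3136 features feeding the 1024-unit fc1
```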
@@ -18,30 +18,63 @@ from __future__ import division
from __future__ import print_function

import time

import tensorflow as tf

import mnist

BATCH_SIZE = 100


def dummy_input_fn():
  image = tf.random_uniform([BATCH_SIZE, 784])
  labels = tf.random_uniform([BATCH_SIZE], maxval=9, dtype=tf.int32)
  return image, tf.one_hot(labels, 10)


def make_estimator():
  data_format = 'channels_last'
  if tf.test.is_built_with_cuda():
    data_format = 'channels_first'
  return tf.estimator.Estimator(
      model_fn=mnist.model_fn, params={
          'data_format': data_format
      })


class Tests(tf.test.TestCase):

  def test_mnist(self):
    classifier = make_estimator()
    classifier.train(input_fn=dummy_input_fn, steps=2)
    eval_results = classifier.evaluate(input_fn=dummy_input_fn, steps=1)

    loss = eval_results['loss']
    global_step = eval_results['global_step']
    accuracy = eval_results['accuracy']
    self.assertEqual(loss.shape, ())
    self.assertEqual(2, global_step)
    self.assertEqual(accuracy.shape, ())

    input_fn = lambda: tf.random_uniform([3, 784])
    predictions_generator = classifier.predict(input_fn)
    for i in range(3):
      predictions = next(predictions_generator)
      self.assertEqual(predictions['probabilities'].shape, (10,))
      self.assertEqual(predictions['classes'].shape, ())

  def mnist_model_fn_helper(self, mode):
    features, labels = dummy_input_fn()
    image_count = features.shape[0]
    spec = mnist.model_fn(features, labels, mode, {
        'data_format': 'channels_last'
    })

    if mode == tf.estimator.ModeKeys.PREDICT:
      predictions = spec.predictions
      self.assertAllEqual(predictions['probabilities'].shape, (image_count, 10))
      self.assertEqual(predictions['probabilities'].dtype, tf.float32)
      self.assertAllEqual(predictions['classes'].shape, (image_count,))
      self.assertEqual(predictions['classes'].dtype, tf.int64)

    if mode != tf.estimator.ModeKeys.PREDICT:
      loss = spec.loss
@@ -65,5 +98,31 @@ class BaseTest(tf.test.TestCase):
    self.mnist_model_fn_helper(tf.estimator.ModeKeys.PREDICT)


class Benchmarks(tf.test.Benchmark):

  def benchmark_train_step_time(self):
    classifier = make_estimator()
    # Run one step to warmup any use of the GPU.
    classifier.train(input_fn=dummy_input_fn, steps=1)

    have_gpu = tf.test.is_gpu_available()
    num_steps = 1000 if have_gpu else 100
    name = 'train_step_time_%s' % ('gpu' if have_gpu else 'cpu')

    start = time.time()
    classifier.train(input_fn=dummy_input_fn, steps=num_steps)
    end = time.time()

    wall_time = (end - start) / num_steps
    self.report_benchmark(
        iters=num_steps,
        wall_time=wall_time,
        name=name,
        extras={
            'examples_per_sec': BATCH_SIZE / wall_time
        })


if __name__ == '__main__':
  tf.logging.set_verbosity(tf.logging.ERROR)
  tf.test.main()
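The benchmark above reports per-step wall time and derives throughput as `BATCH_SIZE / wall_time`. The same timing pattern in miniature, with a trivial stand-in for the training step (a sketch, not repository code):

```python
import time

BATCH_SIZE = 100  # matches the constant in mnist_test.py
num_steps = 100

start = time.time()
for _ in range(num_steps):
  _ = sum(range(1000))  # stand-in for one classifier.train() step
end = time.time()

# Per-step wall time and derived throughput, as report_benchmark receives them.
wall_time = (end - start) / num_steps
examples_per_sec = BATCH_SIZE / wall_time if wall_time > 0 else float('inf')
print(wall_time >= 0.0)  # True
```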
# Copyright 2017 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""MNIST model training using TPUs.
This program demonstrates training of the convolutional neural network model
defined in mnist.py on Google Cloud TPUs (https://cloud.google.com/tpu/).
If you are not interested in TPUs, you should ignore this file.
"""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import tensorflow as tf
import dataset
import mnist
tf.flags.DEFINE_string("data_dir", "",
                       "Path to directory containing the MNIST dataset")
tf.flags.DEFINE_string("model_dir", None, "Estimator model_dir")
tf.flags.DEFINE_integer("batch_size", 1024,
                        "Mini-batch size for the training. Note that this "
                        "is the global batch size and not the per-shard batch.")
tf.flags.DEFINE_integer("train_steps", 1000, "Total number of training steps.")
tf.flags.DEFINE_integer("eval_steps", 0,
                        "Total number of evaluation steps. If `0`, evaluation "
                        "after training is skipped.")
tf.flags.DEFINE_float("learning_rate", 0.05, "Learning rate.")
tf.flags.DEFINE_bool("use_tpu", True, "Use TPUs rather than plain CPUs")
tf.flags.DEFINE_string("master", "local", "GRPC URL of the Cloud TPU instance.")
tf.flags.DEFINE_integer("iterations", 50,
                        "Number of iterations per TPU training loop.")
tf.flags.DEFINE_integer("num_shards", 8, "Number of shards (TPU chips).")

FLAGS = tf.flags.FLAGS
def metric_fn(labels, logits):
  accuracy = tf.metrics.accuracy(
      labels=tf.argmax(labels, axis=1), predictions=tf.argmax(logits, axis=1))
  return {"accuracy": accuracy}
def model_fn(features, labels, mode, params):
  del params
  if mode == tf.estimator.ModeKeys.PREDICT:
    raise RuntimeError("mode {} is not supported yet".format(mode))

  image = features
  if isinstance(image, dict):
    image = features["image"]

  model = mnist.Model("channels_last")
  logits = model(image, training=(mode == tf.estimator.ModeKeys.TRAIN))
  loss = tf.losses.softmax_cross_entropy(onehot_labels=labels, logits=logits)

  if mode == tf.estimator.ModeKeys.TRAIN:
    learning_rate = tf.train.exponential_decay(
        FLAGS.learning_rate,
        tf.train.get_global_step(),
        decay_steps=100000,
        decay_rate=0.96)
    optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate)
    if FLAGS.use_tpu:
      optimizer = tf.contrib.tpu.CrossShardOptimizer(optimizer)
    return tf.contrib.tpu.TPUEstimatorSpec(
        mode=mode,
        loss=loss,
        train_op=optimizer.minimize(loss, tf.train.get_global_step()))

  if mode == tf.estimator.ModeKeys.EVAL:
    return tf.contrib.tpu.TPUEstimatorSpec(
        mode=mode, loss=loss, eval_metrics=(metric_fn, [labels, logits]))
def train_input_fn(params):
  # Retrieves the batch size for the current shard. The number of shards is
  # computed according to the input pipeline deployment. See
  # `tf.contrib.tpu.RunConfig` for details.
  batch_size = params["batch_size"]
  data_dir = params["data_dir"]
  ds = dataset.train(data_dir).cache().repeat().shuffle(
      buffer_size=50000).apply(
          tf.contrib.data.batch_and_drop_remainder(batch_size))
  images, labels = ds.make_one_shot_iterator().get_next()
  return images, labels
def eval_input_fn(params):
  batch_size = params["batch_size"]
  data_dir = params["data_dir"]
  ds = dataset.test(data_dir).apply(
      tf.contrib.data.batch_and_drop_remainder(batch_size))
  images, labels = ds.make_one_shot_iterator().get_next()
  return images, labels
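Both input functions batch with `tf.contrib.data.batch_and_drop_remainder`, which discards the final partial batch so every batch fed to the TPU has a static shape. A plain-Python sketch of that behavior (an illustration only, not the TensorFlow implementation):

```python
def batch_and_drop_remainder(seq, batch_size):
    # Group elements into full batches and drop the trailing partial batch;
    # TPU execution requires a static batch dimension, which is why the
    # input pipelines here cannot emit a smaller final batch.
    return [seq[i:i + batch_size]
            for i in range(0, len(seq) - batch_size + 1, batch_size)]

print(batch_and_drop_remainder(list(range(10)), 3))
# -> [[0, 1, 2], [3, 4, 5], [6, 7, 8]]  (the trailing element 9 is dropped)
```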
def main(argv):
  del argv  # Unused.
  tf.logging.set_verbosity(tf.logging.INFO)

  run_config = tf.contrib.tpu.RunConfig(
      master=FLAGS.master,
      evaluation_master=FLAGS.master,
      model_dir=FLAGS.model_dir,
      session_config=tf.ConfigProto(
          allow_soft_placement=True, log_device_placement=True),
      tpu_config=tf.contrib.tpu.TPUConfig(FLAGS.iterations, FLAGS.num_shards),
  )

  estimator = tf.contrib.tpu.TPUEstimator(
      model_fn=model_fn,
      use_tpu=FLAGS.use_tpu,
      train_batch_size=FLAGS.batch_size,
      eval_batch_size=FLAGS.batch_size,
      params={"data_dir": FLAGS.data_dir},
      config=run_config)
  # TPUEstimator.train *requires* a max_steps argument.
  estimator.train(input_fn=train_input_fn, max_steps=FLAGS.train_steps)
  # TPUEstimator.evaluate *requires* a steps argument. Note that the number
  # of examples used during evaluation is --eval_steps * --batch_size, so if
  # you change --batch_size you should also change --eval_steps.
  if FLAGS.eval_steps:
    estimator.evaluate(input_fn=eval_input_fn, steps=FLAGS.eval_steps)


if __name__ == "__main__":
  tf.app.run()
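The learning-rate schedule in `model_fn` above uses `tf.train.exponential_decay` with `decay_steps=100000` and `decay_rate=0.96` (continuous, non-staircase). The schedule reduces to a one-line formula; a plain-Python sketch for reference:

```python
def exponential_decay(base_lr, global_step, decay_steps, decay_rate):
    # lr = base_lr * decay_rate ** (global_step / decay_steps),
    # matching tf.train.exponential_decay with staircase=False.
    return base_lr * decay_rate ** (global_step / float(decay_steps))

# After one full decay period the rate drops by the decay factor:
print(exponential_decay(0.05, 100000, 100000, 0.96))  # ~0.048
```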
...@@ -46,3 +46,6 @@ python imagenet_main.py --data_dir=/path/to/imagenet
The model will begin training and will automatically evaluate itself on the validation data roughly once per epoch.
Note that there are a number of other options you can specify, including `--model_dir` to choose where to store the model and `--resnet_size` to choose the model size (options include ResNet-18 through ResNet-200). See [`imagenet_main.py`](imagenet_main.py) for the full list of options.
### Pre-trained model
You can download a 190 MB pre-trained version of ResNet-50 achieving 75.3% top-1 single-crop accuracy here: [resnet50_2017_11_30.tar.gz](http://download.tensorflow.org/models/official/resnet50_2017_11_30.tar.gz). Simply download and uncompress the file, and point the model to the extracted directory using the `--model_dir` flag.
...@@ -12,7 +12,7 @@
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Example code for TensorFlow Wide & Deep Tutorial using tf.estimator API."""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
...
...@@ -18,8 +18,9 @@ installation](https://www.tensorflow.org/install).
- [attention_ocr](attention_ocr): a model for real-world image text
  extraction.
- [audioset](audioset): Models and supporting code for use with
  [AudioSet](http://g.co/audioset).
- [autoencoder](autoencoder): various autoencoders.
- [brain_coder](brain_coder): Program synthesis with reinforcement learning.
- [cognitive_mapping_and_planning](cognitive_mapping_and_planning):
  implementation of a spatial memory based mapping and planning architecture
  for visual navigation.
...@@ -61,6 +62,7 @@ installation](https://www.tensorflow.org/install).
  using a Deep RNN.
- [swivel](swivel): the Swivel algorithm for generating word embeddings.
- [syntaxnet](syntaxnet): neural models of natural language syntax.
- [tcn](tcn): Self-supervised representation learning from multi-view video.
- [textsum](textsum): sequence-to-sequence with attention model for text
  summarization.
- [transformer](transformer): spatial transformer network, which allows the
...
...@@ -29,6 +29,7 @@ Network Architecture | Adversarial training | Checkpoint
Inception v3 | Step L.L. | [adv_inception_v3_2017_08_18.tar.gz](http://download.tensorflow.org/models/adv_inception_v3_2017_08_18.tar.gz)
Inception v3 | Step L.L. on ensemble of 3 models | [ens3_adv_inception_v3_2017_08_18.tar.gz](http://download.tensorflow.org/models/ens3_adv_inception_v3_2017_08_18.tar.gz)
Inception v3 | Step L.L. on ensemble of 4 models | [ens4_adv_inception_v3_2017_08_18.tar.gz](http://download.tensorflow.org/models/ens4_adv_inception_v3_2017_08_18.tar.gz)
Inception ResNet v2 | Step L.L. | [adv_inception_resnet_v2_2017_12_18.tar.gz](http://download.tensorflow.org/models/adv_inception_resnet_v2_2017_12_18.tar.gz)
Inception ResNet v2 | Step L.L. on ensemble of 3 models | [ens_adv_inception_resnet_v2_2017_08_18.tar.gz](http://download.tensorflow.org/models/ens_adv_inception_resnet_v2_2017_08_18.tar.gz)
All checkpoints are compatible with
...
# Coming Soon!
This directory will soon be populated with TensorFlow models and data
processing code for identifying exoplanets in astrophysical light curves.
For full details, see the following paper:
*Identifying Exoplanets With Deep Learning: A Five Planet Resonant Chain Around
Kepler-80 And An Eighth Planet Around Kepler-90*
Christopher J Shallue and Andrew Vanderburg
To appear in the Astronomical Journal
Preprint available at https://www.cfa.harvard.edu/~avanderb/kepler90i.pdf
Contact: Chris Shallue (@cshallue)
"""A script to run inference on a set of image files.

NOTE #1: The Attention OCR model was trained only using the FSNS train dataset
and it will work only for images which look more or less similar to French
street names. In order to apply it to images from a different distribution you
need to retrain (or at least fine-tune) it using images from that distribution.

NOTE #2: This script exists for demo purposes only. It is highly recommended
to use tools and mechanisms provided by the TensorFlow Serving system to run
...@@ -20,10 +20,11 @@ import PIL.Image
import tensorflow as tf
from tensorflow.python.platform import flags
from tensorflow.python.training import monitored_session

import common_flags
import datasets
import data_provider

FLAGS = flags.FLAGS
common_flags.define()
...@@ -44,7 +45,7 @@ def get_dataset_image_size(dataset_name):
def load_images(file_pattern, batch_size, dataset_name):
  width, height = get_dataset_image_size(dataset_name)
  images_actual_data = np.ndarray(shape=(batch_size, height, width, 3),
                                  dtype='uint8')
  for i in range(batch_size):
    path = file_pattern % i
    print("Reading %s" % path)
...@@ -53,34 +54,40 @@ def load_images(file_pattern, batch_size, dataset_name):
  return images_actual_data
def create_model(batch_size, dataset_name):
  width, height = get_dataset_image_size(dataset_name)
  dataset = common_flags.create_dataset(split_name=FLAGS.split_name)
  model = common_flags.create_model(
      num_char_classes=dataset.num_char_classes,
      seq_length=dataset.max_sequence_length,
      num_views=dataset.num_of_views,
      null_code=dataset.null_code,
      charset=dataset.charset)
  raw_images = tf.placeholder(tf.uint8, shape=[batch_size, height, width, 3])
  images = tf.map_fn(data_provider.preprocess_image, raw_images,
                     dtype=tf.float32)
  endpoints = model.create_base(images, labels_one_hot=None)
  return raw_images, endpoints
def run(checkpoint, batch_size, dataset_name, image_path_pattern):
  images_placeholder, endpoints = create_model(batch_size, dataset_name)
  images_data = load_images(image_path_pattern, batch_size, dataset_name)
  session_creator = monitored_session.ChiefSessionCreator(
      checkpoint_filename_with_path=checkpoint)
  with monitored_session.MonitoredSession(
      session_creator=session_creator) as sess:
    predictions = sess.run(endpoints.predicted_text,
                           feed_dict={images_placeholder: images_data})
  return predictions.tolist()
def main(_):
  print("Predicted strings:")
  predictions = run(FLAGS.checkpoint, FLAGS.batch_size, FLAGS.dataset_name,
                    FLAGS.image_path_pattern)
  for line in predictions:
    print(line)
...
#!/usr/bin/python
# -*- coding: UTF-8 -*-
import demo_inference
import tensorflow as tf
from tensorflow.python.training import monitored_session

_CHECKPOINT = 'model.ckpt-399731'
_CHECKPOINT_URL = 'http://download.tensorflow.org/models/attention_ocr_2017_08_09.tar.gz'


class DemoInferenceTest(tf.test.TestCase):
  def setUp(self):
    super(DemoInferenceTest, self).setUp()
    for suffix in ['.meta', '.index', '.data-00000-of-00001']:
      filename = _CHECKPOINT + suffix
      self.assertTrue(tf.gfile.Exists(filename),
                      msg='Missing checkpoint file %s. '
                          'Please download and extract it from %s' %
                          (filename, _CHECKPOINT_URL))
    self._batch_size = 32
  def test_moving_variables_properly_loaded_from_a_checkpoint(self):
    batch_size = 32
    dataset_name = 'fsns'
    images_placeholder, endpoints = demo_inference.create_model(batch_size,
                                                                dataset_name)
    image_path_pattern = 'testdata/fsns_train_%02d.png'
    images_data = demo_inference.load_images(image_path_pattern, batch_size,
                                             dataset_name)
    tensor_name = 'AttentionOcr_v1/conv_tower_fn/INCE/InceptionV3/Conv2d_2a_3x3/BatchNorm/moving_mean'
    moving_mean_tf = tf.get_default_graph().get_tensor_by_name(
        tensor_name + ':0')
    reader = tf.train.NewCheckpointReader(_CHECKPOINT)
    moving_mean_expected = reader.get_tensor(tensor_name)
    session_creator = monitored_session.ChiefSessionCreator(
        checkpoint_filename_with_path=_CHECKPOINT)
    with monitored_session.MonitoredSession(
        session_creator=session_creator) as sess:
      moving_mean_np = sess.run(moving_mean_tf,
                                feed_dict={images_placeholder: images_data})
    self.assertAllEqual(moving_mean_expected, moving_mean_np)
  def test_correct_results_on_test_data(self):
    image_path_pattern = 'testdata/fsns_train_%02d.png'
    predictions = demo_inference.run(_CHECKPOINT, self._batch_size, 'fsns',
                                     image_path_pattern)
    self.assertEqual([
'Boulevard de Lunel░░░░░░░░░░░░░░░░░░░',
'Rue de Provence░░░░░░░░░░░░░░░░░░░░░░',
'Rue de Port Maria░░░░░░░░░░░░░░░░░░░░',
'Avenue Charles Gounod░░░░░░░░░░░░░░░░',
'Rue de l‘Aurore░░░░░░░░░░░░░░░░░░░░░░',
'Rue de Beuzeville░░░░░░░░░░░░░░░░░░░░',
'Rue d‘Orbey░░░░░░░░░░░░░░░░░░░░░░░░░░',
'Rue Victor Schoulcher░░░░░░░░░░░░░░░░',
'Rue de la Gare░░░░░░░░░░░░░░░░░░░░░░░',
'Rue des Tulipes░░░░░░░░░░░░░░░░░░░░░░',
'Rue André Maginot░░░░░░░░░░░░░░░░░░░░',
'Route de Pringy░░░░░░░░░░░░░░░░░░░░░░',
'Rue des Landelles░░░░░░░░░░░░░░░░░░░░',
'Rue des Ilettes░░░░░░░░░░░░░░░░░░░░░░',
'Avenue de Maurin░░░░░░░░░░░░░░░░░░░░░',
'Rue Théresa░░░░░░░░░░░░░░░░░░░░░░░░░░', # GT='Rue Thérésa'
'Route de la Balme░░░░░░░░░░░░░░░░░░░░',
'Rue Hélène Roederer░░░░░░░░░░░░░░░░░░',
'Rue Emile Bernard░░░░░░░░░░░░░░░░░░░░',
'Place de la Mairie░░░░░░░░░░░░░░░░░░░',
'Rue des Perrots░░░░░░░░░░░░░░░░░░░░░░',
'Rue de la Libération░░░░░░░░░░░░░░░░░',
'Impasse du Capcir░░░░░░░░░░░░░░░░░░░░',
'Avenue de la Grand Mare░░░░░░░░░░░░░░',
'Rue Pierre Brossolette░░░░░░░░░░░░░░░',
'Rue de Provence░░░░░░░░░░░░░░░░░░░░░░',
'Rue du Docteur Mourre░░░░░░░░░░░░░░░░',
'Rue d‘Ortheuil░░░░░░░░░░░░░░░░░░░░░░░',
'Rue des Sarments░░░░░░░░░░░░░░░░░░░░░',
'Rue du Centre░░░░░░░░░░░░░░░░░░░░░░░░',
'Impasse Pierre Mourgues░░░░░░░░░░░░░░',
'Rue Marcel Dassault░░░░░░░░░░░░░░░░░░'
], predictions)
if __name__ == '__main__':
  tf.test.main()
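The expected strings in `test_correct_results_on_test_data` are padded with the '░' glyph out to the model's fixed sequence length. A small helper sketching that padding convention (the helper name and the length 37 are illustrative, not from the codebase):

```python
def pad_prediction(text, seq_length, fill=u'░'):
    # Pad a predicted string to the fixed sequence length the model emits;
    # trailing null characters render as the fill glyph, as in the FSNS
    # demo output above.
    return text + fill * (seq_length - len(text))

print(pad_prediction(u'Rue du Centre', 37))
```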
...@@ -85,7 +85,7 @@ class CharsetMapper(object):
    """
    mapping_strings = tf.constant(_dict_to_array(charset, default_character))
    self.table = tf.contrib.lookup.index_to_string_table_from_tensor(
        mapping=mapping_strings, default_value=default_character)

  def get_text(self, ids):
    """Returns a string corresponding to a sequence of character ids.
...@@ -94,7 +94,7 @@ class CharsetMapper(object):
      ids: a tensor with shape [batch_size, max_sequence_length]
    """
    return tf.reduce_join(
        self.table.lookup(tf.to_int64(ids)), reduction_indices=1)
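`CharsetMapper.get_text` maps each character id through the lookup table and joins along the sequence axis. The same logic in plain Python, with a toy charset assumed for illustration:

```python
def get_text(ids_batch, charset, default=u'░'):
    # Map each character id to its glyph and join every row into a string,
    # mirroring table.lookup followed by tf.reduce_join over axis 1.
    return [u''.join(charset.get(i, default) for i in row)
            for row in ids_batch]

charset = {0: u'a', 1: u'b', 2: u'c'}
print(get_text([[0, 1, 2], [2, 2, 0]], charset))  # -> ['abc', 'cca']
```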

def get_softmax_loss_fn(label_smoothing):
...@@ -111,12 +111,12 @@ def get_softmax_loss_fn(label_smoothing):
    def loss_fn(labels, logits):
      return (tf.nn.softmax_cross_entropy_with_logits(
          logits=logits, labels=labels))
  else:
    def loss_fn(labels, logits):
      return tf.nn.sparse_softmax_cross_entropy_with_logits(
          logits=logits, labels=labels)

  return loss_fn
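`get_softmax_loss_fn` picks the dense cross-entropy when labels are smoothed (one-hot distributions) and the sparse variant otherwise. For reference, the dense loss computed by hand, as a sketch rather than the TF kernel:

```python
import math

def softmax_cross_entropy(logits, onehot_labels):
    # Numerically stable softmax followed by cross-entropy against the
    # (possibly smoothed) label distribution.
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    z = sum(exps)
    return -sum(t * math.log(e / z)
                for t, e in zip(onehot_labels, exps) if t)

# Uniform logits over two classes against a hard label give ln(2):
print(softmax_cross_entropy([0.0, 0.0], [1.0, 0.0]))  # ~0.693
```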
...@@ -125,12 +125,12 @@ class Model(object):
  """Class to create the Attention OCR Model."""

  def __init__(self,
               num_char_classes,
               seq_length,
               num_views,
               null_code,
               mparams=None,
               charset=None):
    """Initializes model parameters.

    Args:
...@@ -151,10 +151,10 @@ class Model(object):
    """
    super(Model, self).__init__()
    self._params = ModelParams(
        num_char_classes=num_char_classes,
        seq_length=seq_length,
        num_views=num_views,
        null_code=null_code)
    self._mparams = self.default_mparams()
    if mparams:
      self._mparams.update(mparams)
...@@ -166,16 +166,16 @@ class Model(object):
        ConvTowerParams(final_endpoint='Mixed_5d'),
        'sequence_logit_fn':
            SequenceLogitsParams(
                use_attention=True,
                use_autoregression=True,
                num_lstm_units=256,
                weight_decay=0.00004,
                lstm_state_clip_value=10.0),
        'sequence_loss_fn':
            SequenceLossParams(
                label_smoothing=0.1,
                ignore_nulls=True,
                average_across_timesteps=False),
        'encode_coordinates_fn': EncodeCoordinatesParams(enabled=False)
    }
...@@ -201,11 +201,11 @@ class Model(object):
    with tf.variable_scope('conv_tower_fn/INCE'):
      if reuse:
        tf.get_variable_scope().reuse_variables()
      with slim.arg_scope(inception.inception_v3_arg_scope()):
        with slim.arg_scope([slim.batch_norm, slim.dropout],
                            is_training=is_training):
          net, _ = inception.inception_v3_base(
              images, final_endpoint=mparams.final_endpoint)
    return net

  def _create_lstm_inputs(self, net):
...@@ -261,7 +261,7 @@ class Model(object):
      nets_for_merge.append(tf.reshape(net, xy_flat_shape))
    merged_net = tf.concat(nets_for_merge, 1)
    net = slim.max_pool2d(
        merged_net, kernel_size=[len(nets_list), 1], stride=1)
    net = tf.reshape(net, (batch_size, height, width, num_features))
    return net
...@@ -303,7 +303,7 @@ class Model(object):
    log_prob = utils.logits_to_log_prob(chars_logit)
    ids = tf.to_int32(tf.argmax(log_prob, axis=2), name='predicted_chars')
    mask = tf.cast(
        slim.one_hot_encoding(ids, self._params.num_char_classes), tf.bool)
    all_scores = tf.nn.softmax(chars_logit)
    selected_scores = tf.boolean_mask(all_scores, mask, name='char_scores')
    scores = tf.reshape(selected_scores, shape=(-1, self._params.seq_length))
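The prediction step above takes the argmax id per timestep and then uses a one-hot boolean mask to pull out that id's softmax score. The same selection in plain Python (a sketch over list-of-lists instead of tensors):

```python
def select_char_scores(all_scores):
    # all_scores: [seq_length][num_classes] rows of softmax scores.
    # Return the argmax class id and its score for each timestep, which is
    # what the one-hot mask plus boolean_mask achieves on tensors.
    ids = [row.index(max(row)) for row in all_scores]
    scores = [row[i] for row, i in zip(all_scores, ids)]
    return ids, scores

print(select_char_scores([[0.1, 0.9], [0.7, 0.3]]))  # -> ([1, 0], [0.9, 0.7])
```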
...@@ -334,10 +334,10 @@ class Model(object):
    return net

  def create_base(self,
                  images,
                  labels_one_hot,
                  scope='AttentionOcr_v1',
                  reuse=None):
    """Creates a base part of the Model (no gradients, losses or summaries).

    Args:
...@@ -355,7 +355,7 @@ class Model(object):
    is_training = labels_one_hot is not None
    with tf.variable_scope(scope, reuse=reuse):
      views = tf.split(
          value=images, num_or_size_splits=self._params.num_views, axis=2)
      logging.debug('Views=%d single view: %s', len(views), views[0])
      nets = [
...@@ -381,11 +381,11 @@ class Model(object):
      else:
        predicted_text = tf.constant([])
    return OutputEndpoints(
        chars_logit=chars_logit,
        chars_log_prob=chars_log_prob,
        predicted_chars=predicted_chars,
        predicted_scores=predicted_scores,
        predicted_text=predicted_text)

  def create_loss(self, data, endpoints):
    """Creates all losses required to train the model.
...@@ -421,7 +421,7 @@ class Model(object):
      A tensor with the same shape as the input.
    """
    one_hot_labels = tf.one_hot(
        chars_labels, depth=self._params.num_char_classes, axis=-1)
    pos_weight = 1.0 - weight
    neg_weight = weight / self._params.num_char_classes
    return one_hot_labels * pos_weight + neg_weight
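`label_smoothing_regularization` computes `one_hot * (1 - weight) + weight / num_char_classes`, so the true class keeps `1 - weight + weight/C`, every other class gets `weight/C`, and each row still sums to 1. A plain-Python check of that arithmetic:

```python
def smooth_labels(char_label, num_classes, weight=0.1):
    # True class: (1 - weight) + weight / num_classes;
    # all other classes: weight / num_classes.
    pos_weight = 1.0 - weight
    neg_weight = weight / num_classes
    return [(pos_weight if i == char_label else 0.0) + neg_weight
            for i in range(num_classes)]

row = smooth_labels(2, 4, weight=0.1)
print(row)       # true class ~0.925, others ~0.025
print(sum(row))  # ~1.0
```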
...@@ -446,7 +446,7 @@ class Model(object): ...@@ -446,7 +446,7 @@ class Model(object):
with tf.variable_scope('sequence_loss_fn/SLF'): with tf.variable_scope('sequence_loss_fn/SLF'):
if mparams.label_smoothing > 0: if mparams.label_smoothing > 0:
smoothed_one_hot_labels = self.label_smoothing_regularization( smoothed_one_hot_labels = self.label_smoothing_regularization(
chars_labels, mparams.label_smoothing) chars_labels, mparams.label_smoothing)
labels_list = tf.unstack(smoothed_one_hot_labels, axis=1) labels_list = tf.unstack(smoothed_one_hot_labels, axis=1)
else: else:
# NOTE: in case of sparse softmax we are not using one-hot # NOTE: in case of sparse softmax we are not using one-hot
...@@ -459,20 +459,20 @@ class Model(object):
      else:
        # Suppose that reject character is the last in the charset.
        reject_char = tf.constant(
            self._params.num_char_classes - 1,
            shape=(batch_size, seq_length),
            dtype=tf.int64)
        known_char = tf.not_equal(chars_labels, reject_char)
        weights = tf.to_float(known_char)

      logits_list = tf.unstack(chars_logits, axis=1)
      weights_list = tf.unstack(weights, axis=1)
      loss = tf.contrib.legacy_seq2seq.sequence_loss(
          logits_list,
          labels_list,
          weights_list,
          softmax_loss_function=get_softmax_loss_fn(mparams.label_smoothing),
          average_across_timesteps=mparams.average_across_timesteps)
      tf.losses.add_loss(loss)
      return loss
...@@ -507,7 +507,7 @@ class Model(object):
      if is_training:
        tf.summary.image(
            sname('image/orig'), data.images_orig, max_outputs=max_outputs)

      for var in tf.trainable_variables():
        tf.summary.histogram(var.op.name, var)

      return None
...@@ -522,17 +522,17 @@ class Model(object):
    use_metric('CharacterAccuracy',
               metrics.char_accuracy(
                   endpoints.predicted_chars,
                   data.labels,
                   streaming=True,
                   rej_char=self._params.null_code))
    # Sequence accuracy computed by cutting sequence at the first null char
    use_metric('SequenceAccuracy',
               metrics.sequence_accuracy(
                   endpoints.predicted_chars,
                   data.labels,
                   streaming=True,
                   rej_char=self._params.null_code))

    for name, value in names_to_values.iteritems():
      summary_name = 'eval/' + name
...@@ -540,7 +540,7 @@ class Model(object):
    return names_to_updates.values()

  def create_init_fn_to_restore(self, master_checkpoint,
                                inception_checkpoint=None):
    """Creates init operations to restore weights from various checkpoints.

    Args:
...@@ -565,12 +565,15 @@ class Model(object):
      all_assign_ops.append(assign_op)
      all_feed_dict.update(feed_dict)

    logging.info('variables_to_restore:\n%s' %
                 utils.variables_to_restore().keys())
    logging.info('moving_average_variables:\n%s' %
                 [v.op.name for v in tf.moving_average_variables()])
    logging.info('trainable_variables:\n%s' %
                 [v.op.name for v in tf.trainable_variables()])
    if master_checkpoint:
      assign_from_checkpoint(utils.variables_to_restore(), master_checkpoint)

    if inception_checkpoint:
      variables = utils.variables_to_restore(
          'AttentionOcr_v1/conv_tower_fn/INCE', strip_scope=True)
      assign_from_checkpoint(variables, inception_checkpoint)

    def init_assign_fn(sess):
...