AttentionOCR partial TF2 migration (#8843)

* update readme according to template
* initial upgrades
* demo_inference updated, need to check tf.flags
* issues with tf.image
* manual checks needed for tf.contrib
* more tf2 updates, need to fix tf.contrib and tf.flags

Commit 50fa3eb9, authored Jul 21, 2020 by kyscg, committed by GitHub Jul 21, 2020
20 changed files
--- a/research/attention_ocr/README.md
+++ b/research/attention_ocr/README.md
-## Attention-based Extraction of Structured Information from Street View Imagery
+# Attention-based Extraction of Structured Information from Street View Imagery
 [![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/attention-based-extraction-of-structured/optical-character-recognition-on-fsns-test)](https://paperswithcode.com/sota/optical-character-recognition-on-fsns-test?p=attention-based-extraction-of-structured)
 [![Paper](http://img.shields.io/badge/paper-arXiv.1704.03549-B3181B.svg)](https://arxiv.org/abs/1704.03549)
@@ -7,14 +7,20 @@
 *A TensorFlow model for real-world image text extraction problems.*
 This folder contains the code needed to train a new Attention OCR model on the
-[FSNS dataset][FSNS] dataset to transcribe street names in France. You can
-also use it to train it on your own data.
+[FSNS dataset][FSNS] to transcribe street names in France. You can also train the code on your own data.
 More details can be found in our paper:
 ["Attention-based Extraction of Structured Information from Street View
 Imagery"](https://arxiv.org/abs/1704.03549)
+## Description
+* Paper presents a model based on ConvNets, RNN's and a novel attention mechanism.
+Achieves **84.2%** on FSNS beating the previous benchmark (**72.46%**). Also studies
+the speed/accuracy tradeoff that results from using CNN feature extractors of
+different depths.
 ## Contacts
 Authors
@@ -22,7 +28,18 @@ Authors
 * Zbigniew Wojna (zbigniewwojna@gmail.com)
 * Alexander Gorban (gorban@google.com)
-Maintainer: Xavier Gibert [@xavigibert](https://github.com/xavigibert)
+Maintainer
+* Xavier Gibert ([@xavigibert](https://github.com/xavigibert))
+## Table of Contents
+* [Requirements](https://github.com/tensorflow/models/blob/master/research/attention_ocr/README.md#requirements)
+* [Dataset](https://github.com/tensorflow/models/blob/master/research/attention_ocr/README.md#dataset)
+* [How to use this code](https://github.com/tensorflow/models/blob/master/research/attention_ocr/README.md#how-to-use-this-code)
+* [Using your own image data](https://github.com/tensorflow/models/blob/master/research/attention_ocr/README.md#using-your-own-image-data)
+* [How to use a pre-trained model](https://github.com/tensorflow/models/blob/master/research/attention_ocr/README.md#how-to-use-a-pre-trained-model)
+* [Disclaimer](https://github.com/tensorflow/models/blob/master/research/attention_ocr/README.md#disclaimer)
 ## Requirements
@@ -49,6 +66,42 @@ cd ..
 [TF]: https://www.tensorflow.org/install/
 [FSNS]: https://github.com/tensorflow/models/tree/master/research/street
+## Dataset
+The French Street Name Signs (FSNS) dataset is split into subsets, 
+each of which is composed of multiple files. Note that these datasets 
+are very large. The approximate sizes are:
+* Train: 512 files of 300MB each.
+* Validation: 64 files of 40MB each.
+* Test: 64 files of 50MB each.
+* The datasets download includes a directory `testdata` that contains 
+some small datasets that are big enough to test that models can 
+actually learn something.
+* Total: around 158GB
+The download paths are in the following list:
+```
+https://download.tensorflow.org/data/fsns-20160927/charset_size=134.txt
+https://download.tensorflow.org/data/fsns-20160927/test/test-00000-of-00064
+...
+https://download.tensorflow.org/data/fsns-20160927/test/test-00063-of-00064
+https://download.tensorflow.org/data/fsns-20160927/testdata/arial-32-00000-of-00001
+https://download.tensorflow.org/data/fsns-20160927/testdata/fsns-00000-of-00001
+https://download.tensorflow.org/data/fsns-20160927/testdata/mnist-sample-00000-of-00001
+https://download.tensorflow.org/data/fsns-20160927/testdata/numbers-16-00000-of-00001
+https://download.tensorflow.org/data/fsns-20160927/train/train-00000-of-00512
+...
+https://download.tensorflow.org/data/fsns-20160927/train/train-00511-of-00512
+https://download.tensorflow.org/data/fsns-20160927/validation/validation-00000-of-00064
+...
+https://download.tensorflow.org/data/fsns-20160927/validation/validation-00063-of-00064
+```
+All URLs are stored in the [research/street](https://github.com/tensorflow/models/tree/master/research/street) 
+repository in the text file `python/fsns_urls.txt`.
 ## How to use this code
 To run all unit tests:
@@ -80,7 +133,7 @@ tar xf attention_ocr_2017_08_09.tar.gz
 python train.py --checkpoint=model.ckpt-399731
 ```
-## How to use your own image data to train the model
+## Using your own image data
 You need to define a new dataset. There are two options:

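Note on the Dataset section added to the README above: the shard URLs follow a regular `<split>/<split>-NNNNN-of-MMMMM` pattern, and the per-split sizes give a ballpark total. The sketch below enumerates the URLs and sums the sizes. It assumes the elided entries in the list use the same contiguous numbering as the first and last entries shown; the helper names `shard_urls` and `approx_total_gb` are hypothetical and not part of the repository.

```python
BASE_URL = "https://download.tensorflow.org/data/fsns-20160927"

# (number of shards, approximate size of each shard in MB), as quoted in the README.
SPLITS = {
    "train": (512, 300),
    "validation": (64, 40),
    "test": (64, 50),
}


def shard_urls(split):
    """Return the shard URLs for one split, assuming contiguous numbering."""
    num_shards, _ = SPLITS[split]
    return [
        "%s/%s/%s-%05d-of-%05d" % (BASE_URL, split, split, i, num_shards)
        for i in range(num_shards)
    ]


def approx_total_gb():
    """Sum the approximate split sizes, taking 1 GB = 1000 MB."""
    total_mb = sum(count * size_mb for count, size_mb in SPLITS.values())
    return total_mb / 1000.0


if __name__ == "__main__":
    urls = shard_urls("test")
    print(urls[0])
    print(urls[-1])
    print("approx. total: %.0f GB" % approx_total_gb())
```

With the README's rounded figures the three splits sum to roughly 159 GB, which is in line with the "around 158GB" quoted in the README.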
--- a/research/attention_ocr/python/data_provider.py
+++ b/research/attention_ocr/python/data_provider.py
@@ -56,14 +56,14 @@ def augment_image(image):
  Returns:
    Distorted Tensor image of the same shape.
  """
-  with tf.variable_scope('AugmentImage'):
+  with tf.compat.v1.variable_scope('AugmentImage'):
    height = image.get_shape().dims[0].value
    width = image.get_shape().dims[1].value
    # Random crop cut from the street sign image, resized to the same size.
    # Assures that the crop is covers at least 0.8 area of the input image.
    bbox_begin, bbox_size, _ = tf.image.sample_distorted_bounding_box(
-        tf.shape(image),
+        image_size=tf.shape(input=image),
        bounding_boxes=tf.zeros([0, 0, 4]),
        min_object_covered=0.8,
        aspect_ratio_range=[0.8, 1.2],
@@ -74,7 +74,7 @@ def augment_image(image):
    # Randomly chooses one of the 4 interpolation methods
    distorted_image = inception_preprocessing.apply_with_random_selector(
        distorted_image,
-        lambda x, method: tf.image.resize_images(x, [height, width], method),
+        lambda x, method: tf.image.resize(x, [height, width], method),
        num_cases=4)
    distorted_image.set_shape([height, width, 3])
@@ -99,9 +99,10 @@ def central_crop(image, crop_size):
  Returns:
    A tensor of shape [crop_height, crop_width, channels].
  """
-  with tf.variable_scope('CentralCrop'):
+  with tf.compat.v1.variable_scope('CentralCrop'):
    target_width, target_height = crop_size
-    image_height, image_width = tf.shape(image)[0], tf.shape(image)[1]
+    image_height, image_width = tf.shape(
+        input=image)[0], tf.shape(input=image)[1]
    assert_op1 = tf.Assert(
        tf.greater_equal(image_height, target_height),
        ['image_height < target_height', image_height, target_height])
@@ -129,7 +130,7 @@ def preprocess_image(image, augment=False, central_crop_size=None,
    A float32 tensor of shape [H x W x 3] with RGB values in the required
    range.
  """
-  with tf.variable_scope('PreprocessImage'):
+  with tf.compat.v1.variable_scope('PreprocessImage'):
    image = tf.image.convert_image_dtype(image, dtype=tf.float32)
    if augment or central_crop_size:
      if num_towers == 1:
@@ -182,7 +183,7 @@ def get_data(dataset,
      image_orig, augment, central_crop_size, num_towers=dataset.num_of_views)
  label_one_hot = slim.one_hot_encoding(label, dataset.num_char_classes)
-  images, images_orig, labels, labels_one_hot = (tf.train.shuffle_batch(
+  images, images_orig, labels, labels_one_hot = (tf.compat.v1.train.shuffle_batch(
      [image, image_orig, label, label_one_hot],
      batch_size=batch_size,
      num_threads=shuffle_config.num_batching_threads,

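Note on the `tf.image.resize` change in `augment_image` above: `apply_with_random_selector` hands the lambda a plain Python integer (0 to 3) as `method`. The TF1 `resize_images` call interpreted that integer as a `ResizeMethod` enum value, whereas the TF2 `tf.image.resize` is, to the best of my understanding, keyed by method names. This may be one of the "issues with tf.image" the commit message mentions. The sketch below shows one possible translation; the helper `resize_method_name` is hypothetical, and the integer-to-name ordering is an assumption based on the TF1 enum order.

```python
# Assumed TF1 ResizeMethod integer order (0..3), paired with the method
# names that TF2's tf.image.resize is assumed to accept.
TF2_RESIZE_METHODS = ("bilinear", "nearest", "bicubic", "area")


def resize_method_name(case):
    """Translate a selector integer into a resize method name."""
    if not 0 <= case < len(TF2_RESIZE_METHODS):
        raise ValueError("unknown resize method index: %r" % (case,))
    return TF2_RESIZE_METHODS[case]


# Hypothetical use inside augment_image (not executed here):
#   lambda x, method: tf.image.resize(
#       x, [height, width], resize_method_name(method))

if __name__ == "__main__":
    for case in range(4):
        print(case, resize_method_name(case))
```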
--- a/research/attention_ocr/python/datasets/fsns.py
+++ b/research/attention_ocr/python/datasets/fsns.py
@@ -72,7 +72,7 @@ def read_charset(filename, null_character=u'\u2591'):
  """
  pattern = re.compile(r'(\d+)\t(.+)')
  charset = {}
-  with tf.gfile.GFile(filename) as f:
+  with tf.io.gfile.GFile(filename) as f:
    for i, line in enumerate(f):
      m = pattern.match(line)
      if m is None:
@@ -96,9 +96,9 @@ class _NumOfViewsHandler(slim.tfexample_decoder.ItemHandler):
    self._num_of_views = num_of_views
  def tensors_to_item(self, keys_to_tensors):
-    return tf.to_int64(
+    return tf.cast(
        self._num_of_views * keys_to_tensors[self._original_width_key] /
-        keys_to_tensors[self._width_key])
+        keys_to_tensors[self._width_key], dtype=tf.int64)
 def get_split(split_name, dataset_dir=None, config=None):
@@ -133,19 +133,19 @@ def get_split(split_name, dataset_dir=None, config=None):
  zero = tf.zeros([1], dtype=tf.int64)
  keys_to_features = {
      'image/encoded':
-      tf.FixedLenFeature((), tf.string, default_value=''),
+      tf.io.FixedLenFeature((), tf.string, default_value=''),
      'image/format':
-      tf.FixedLenFeature((), tf.string, default_value='png'),
+      tf.io.FixedLenFeature((), tf.string, default_value='png'),
      'image/width':
-      tf.FixedLenFeature([1], tf.int64, default_value=zero),
+      tf.io.FixedLenFeature([1], tf.int64, default_value=zero),
      'image/orig_width':
-      tf.FixedLenFeature([1], tf.int64, default_value=zero),
+      tf.io.FixedLenFeature([1], tf.int64, default_value=zero),
      'image/class':
-      tf.FixedLenFeature([config['max_sequence_length']], tf.int64),
+      tf.io.FixedLenFeature([config['max_sequence_length']], tf.int64),
      'image/unpadded_class':
-      tf.VarLenFeature(tf.int64),
+      tf.io.VarLenFeature(tf.int64),
      'image/text':
-      tf.FixedLenFeature([1], tf.string, default_value=''),
+      tf.io.FixedLenFeature([1], tf.string, default_value=''),
  }
  items_to_handlers = {
      'image':
@@ -171,7 +171,7 @@ def get_split(split_name, dataset_dir=None, config=None):
                              config['splits'][split_name]['pattern'])
  return slim.dataset.Dataset(
      data_sources=file_pattern,
-      reader=tf.TFRecordReader,
+      reader=tf.compat.v1.TFRecordReader,
      decoder=decoder,
      num_samples=config['splits'][split_name]['size'],
      items_to_descriptions=config['items_to_descriptions'],

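Note on `read_charset` above: only the file-opening call changes in this hunk, and the parsing still relies on the `(\d+)\t(.+)` pattern. The sketch below reproduces that parsing step without TensorFlow so the expected file format is easier to see. The function name `parse_charset` is hypothetical, and silently skipping malformed lines is an assumption, since the rest of the function body is not shown in the diff.

```python
import io
import re

# Same pattern as in read_charset: "<integer code><TAB><character>".
CHARSET_LINE = re.compile(r'(\d+)\t(.+)')


def parse_charset(lines):
    """Build a {code: character} dict from an iterable of text lines."""
    charset = {}
    for line in lines:
        m = CHARSET_LINE.match(line)
        if m is None:
            # Assumption: malformed lines are skipped.
            continue
        charset[int(m.group(1))] = m.group(2)
    return charset


if __name__ == "__main__":
    sample = io.StringIO("0\ta\n1\tb\nnot a charset line\n2\t \n")
    print(parse_charset(sample))
```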
--- a/research/attention_ocr/python/datasets/fsns_test.py
+++ b/research/attention_ocr/python/datasets/fsns_test.py
@@ -91,7 +91,7 @@ class FsnsTest(tf.test.TestCase):
    image_tf, label_tf = provider.get(['image', 'label'])
    with self.test_session() as sess:
-      sess.run(tf.global_variables_initializer())
+      sess.run(tf.compat.v1.global_variables_initializer())
      with slim.queues.QueueRunners(sess):
        image_np, label_np = sess.run([image_tf, label_tf])

--- a/research/attention_ocr/python/datasets/testdata/fsns/download_data.py
+++ b/research/attention_ocr/python/datasets/testdata/fsns/download_data.py
@@ -10,7 +10,8 @@ KEEP_NUM_RECORDS = 5
 print('Downloading %s ...' % URL)
 urllib.request.urlretrieve(URL, DST_ORIG)
-print('Writing %d records from %s to %s ...' % (KEEP_NUM_RECORDS, DST_ORIG, DST))
+print('Writing %d records from %s to %s ...' %
+      (KEEP_NUM_RECORDS, DST_ORIG, DST))
 with tf.io.TFRecordWriter(DST) as writer:
-    for raw_record in itertools.islice(tf.python_io.tf_record_iterator(DST_ORIG), KEEP_NUM_RECORDS):
+    for raw_record in itertools.islice(tf.compat.v1.python_io.tf_record_iterator(DST_ORIG), KEEP_NUM_RECORDS):
        writer.write(raw_record)
--- a/research/attention_ocr/python/demo_inference.py
+++ b/research/attention_ocr/python/demo_inference.py
@@ -49,7 +49,7 @@ def load_images(file_pattern, batch_size, dataset_name):
  for i in range(batch_size):
    path = file_pattern % i
    print("Reading %s" % path)
-    pil_image = PIL.Image.open(tf.gfile.GFile(path, 'rb'))
+    pil_image = PIL.Image.open(tf.io.gfile.GFile(path, 'rb'))
    images_actual_data[i, ...] = np.asarray(pil_image)
  return images_actual_data
@@ -63,7 +63,8 @@ def create_model(batch_size, dataset_name):
      num_views=dataset.num_of_views,
      null_code=dataset.null_code,
      charset=dataset.charset)
-  raw_images = tf.placeholder(tf.uint8, shape=[batch_size, height, width, 3])
+  raw_images = tf.compat.v1.placeholder(
+      tf.uint8, shape=[batch_size, height, width, 3])
  images = tf.map_fn(data_provider.preprocess_image, raw_images,
                     dtype=tf.float32)
  endpoints = model.create_base(images, labels_one_hot=None)
@@ -93,4 +94,4 @@ def main(_):
 if __name__ == '__main__':
-  tf.app.run()
+  tf.compat.v1.app.run()
--- a/research/attention_ocr/python/demo_inference_test.py
+++ b/research/attention_ocr/python/demo_inference_test.py
@@ -14,12 +14,13 @@ class DemoInferenceTest(tf.test.TestCase):
    super(DemoInferenceTest, self).setUp()
    for suffix in ['.meta', '.index', '.data-00000-of-00001']:
      filename = _CHECKPOINT + suffix
-      self.assertTrue(tf.gfile.Exists(filename),
+      self.assertTrue(tf.io.gfile.exists(filename),
                      msg='Missing checkpoint file %s. '
                          'Please download and extract it from %s' %
                          (filename, _CHECKPOINT_URL))
    self._batch_size = 32
-    tf.flags.FLAGS.dataset_dir = os.path.join(os.path.dirname(__file__), 'datasets/testdata/fsns')
+    tf.flags.FLAGS.dataset_dir = os.path.join(
+        os.path.dirname(__file__), 'datasets/testdata/fsns')
  def test_moving_variables_properly_loaded_from_a_checkpoint(self):
    batch_size = 32
@@ -30,9 +31,9 @@ class DemoInferenceTest(tf.test.TestCase):
    images_data = demo_inference.load_images(image_path_pattern, batch_size,
                                             dataset_name)
    tensor_name = 'AttentionOcr_v1/conv_tower_fn/INCE/InceptionV3/Conv2d_2a_3x3/BatchNorm/moving_mean'
-    moving_mean_tf = tf.get_default_graph().get_tensor_by_name(
+    moving_mean_tf = tf.compat.v1.get_default_graph().get_tensor_by_name(
        tensor_name + ':0')
-    reader = tf.train.NewCheckpointReader(_CHECKPOINT)
+    reader = tf.compat.v1.train.NewCheckpointReader(_CHECKPOINT)
    moving_mean_expected = reader.get_tensor(tensor_name)
    session_creator = monitored_session.ChiefSessionCreator(

--- a/research/attention_ocr/python/eval.py
+++ b/research/attention_ocr/python/eval.py
@@ -45,8 +45,8 @@ flags.DEFINE_integer('number_of_steps', None,
 def main(_):
-  if not tf.gfile.Exists(FLAGS.eval_log_dir):
+  if not tf.io.gfile.exists(FLAGS.eval_log_dir):
-    tf.gfile.MakeDirs(FLAGS.eval_log_dir)
+    tf.io.gfile.makedirs(FLAGS.eval_log_dir)
  dataset = common_flags.create_dataset(split_name=FLAGS.split_name)
  model = common_flags.create_model(dataset.num_char_classes,
@@ -62,7 +62,7 @@ def main(_):
  eval_ops = model.create_summaries(
      data, endpoints, dataset.charset, is_training=False)
  slim.get_or_create_global_step()
-  session_config = tf.ConfigProto(device_count={"GPU": 0})
+  session_config = tf.compat.v1.ConfigProto(device_count={"GPU": 0})
  slim.evaluation.evaluation_loop(
      master=FLAGS.master,
      checkpoint_dir=FLAGS.train_log_dir,

--- a/research/attention_ocr/python/inception_preprocessing.py
+++ b/research/attention_ocr/python/inception_preprocessing.py
@@ -38,7 +38,7 @@ def apply_with_random_selector(x, func, num_cases):
    The result of func(x, sel), where func receives the value of the
    selector as a python integer, but sel is sampled dynamically.
  """
-  sel = tf.random_uniform([], maxval=num_cases, dtype=tf.int32)
+  sel = tf.random.uniform([], maxval=num_cases, dtype=tf.int32)
  # Pass the real x only to one of the func calls.
  return control_flow_ops.merge([
      func(control_flow_ops.switch(x, tf.equal(sel, case))[1], case)
@@ -64,7 +64,7 @@ def distort_color(image, color_ordering=0, fast_mode=True, scope=None):
  Raises:
    ValueError: if color_ordering not in [0, 3]
  """
-  with tf.name_scope(scope, 'distort_color', [image]):
+  with tf.compat.v1.name_scope(scope, 'distort_color', [image]):
    if fast_mode:
      if color_ordering == 0:
        image = tf.image.random_brightness(image, max_delta=32. / 255.)
@@ -131,7 +131,7 @@ def distorted_bounding_box_crop(image,
  Returns:
    A tuple, a 3-D Tensor cropped_image and the distorted bbox
  """
-  with tf.name_scope(scope, 'distorted_bounding_box_crop', [image, bbox]):
+  with tf.compat.v1.name_scope(scope, 'distorted_bounding_box_crop', [image, bbox]):
    # Each bounding box has shape [1, num_boxes, box coords] and
    # the coordinates are ordered [ymin, xmin, ymax, xmax].
@@ -143,7 +143,7 @@ def distorted_bounding_box_crop(image,
    # bounding box. If no box is supplied, then we assume the bounding box is
    # the entire image.
    sample_distorted_bounding_box = tf.image.sample_distorted_bounding_box(
-        tf.shape(image),
+        image_size=tf.shape(input=image),
        bounding_boxes=bbox,
        min_object_covered=min_object_covered,
        aspect_ratio_range=aspect_ratio_range,
@@ -188,7 +188,7 @@ def preprocess_for_train(image,
  Returns:
    3-D float Tensor of distorted image used for training with range [-1, 1].
  """
-  with tf.name_scope(scope, 'distort_image', [image, height, width, bbox]):
+  with tf.compat.v1.name_scope(scope, 'distort_image', [image, height, width, bbox]):
    if bbox is None:
      bbox = tf.constant(
          [0.0, 0.0, 1.0, 1.0], dtype=tf.float32, shape=[1, 1, 4])
@@ -198,7 +198,7 @@ def preprocess_for_train(image,
    # the coordinates are ordered [ymin, xmin, ymax, xmax].
    image_with_box = tf.image.draw_bounding_boxes(
        tf.expand_dims(image, 0), bbox)
-    tf.summary.image('image_with_bounding_boxes', image_with_box)
+    tf.compat.v1.summary.image('image_with_bounding_boxes', image_with_box)
    distorted_image, distorted_bbox = distorted_bounding_box_crop(image, bbox)
    # Restore the shape since the dynamic slice based upon the bbox_size loses
@@ -206,7 +206,7 @@ def preprocess_for_train(image,
    distorted_image.set_shape([None, None, 3])
    image_with_distorted_box = tf.image.draw_bounding_boxes(
        tf.expand_dims(image, 0), distorted_bbox)
-    tf.summary.image('images_with_distorted_bounding_box',
+    tf.compat.v1.summary.image('images_with_distorted_bounding_box',
                               image_with_distorted_box)
    # This resizing operation may distort the images because the aspect
@@ -218,10 +218,10 @@ def preprocess_for_train(image,
    num_resize_cases = 1 if fast_mode else 4
    distorted_image = apply_with_random_selector(
        distorted_image,
-        lambda x, method: tf.image.resize_images(x, [height, width], method=method),
+        lambda x, method: tf.image.resize(x, [height, width], method=method),
        num_cases=num_resize_cases)
-    tf.summary.image('cropped_resized_image',
+    tf.compat.v1.summary.image('cropped_resized_image',
                               tf.expand_dims(distorted_image, 0))
    # Randomly flip the image horizontally.
@@ -233,7 +233,7 @@ def preprocess_for_train(image,
        lambda x, ordering: distort_color(x, ordering, fast_mode),
        num_cases=4)
-    tf.summary.image('final_distorted_image',
+    tf.compat.v1.summary.image('final_distorted_image',
                               tf.expand_dims(distorted_image, 0))
    distorted_image = tf.subtract(distorted_image, 0.5)
    distorted_image = tf.multiply(distorted_image, 2.0)
@@ -265,7 +265,7 @@ def preprocess_for_eval(image,
  Returns:
    3-D float Tensor of prepared image.
  """
-  with tf.name_scope(scope, 'eval_image', [image, height, width]):
+  with tf.compat.v1.name_scope(scope, 'eval_image', [image, height, width]):
    if image.dtype != tf.float32:
      image = tf.image.convert_image_dtype(image, dtype=tf.float32)
    # Crop the central region of the image with an area containing 87.5% of
@@ -276,8 +276,8 @@ def preprocess_for_eval(image,
    if height and width:
      # Resize the image to the specified height and width.
      image = tf.expand_dims(image, 0)
-      image = tf.image.resize_bilinear(
-          image, [height, width], align_corners=False)
+      image = tf.image.resize(
+          image, [height, width], method=tf.image.ResizeMethod.BILINEAR)
      image = tf.squeeze(image, [0])
    image = tf.subtract(image, 0.5)
    image = tf.multiply(image, 2.0)

--- a/research/attention_ocr/python/metrics.py
+++ b/research/attention_ocr/python/metrics.py
@@ -34,20 +34,21 @@ def char_accuracy(predictions, targets, rej_char, streaming=False):
    a update_ops for execution and value tensor whose value on evaluation
    returns the total character accuracy.
  """
-  with tf.variable_scope('CharAccuracy'):
+  with tf.compat.v1.variable_scope('CharAccuracy'):
    predictions.get_shape().assert_is_compatible_with(targets.get_shape())
-    targets = tf.to_int32(targets)
+    targets = tf.cast(targets, dtype=tf.int32)
    const_rej_char = tf.constant(rej_char, shape=targets.get_shape())
-    weights = tf.to_float(tf.not_equal(targets, const_rej_char))
+    weights = tf.cast(tf.not_equal(targets, const_rej_char), dtype=tf.float32)
-    correct_chars = tf.to_float(tf.equal(predictions, targets))
+    correct_chars = tf.cast(tf.equal(predictions, targets), dtype=tf.float32)
-    accuracy_per_example = tf.div(
-        tf.reduce_sum(tf.multiply(correct_chars, weights), 1),
-        tf.reduce_sum(weights, 1))
+    accuracy_per_example = tf.compat.v1.div(
+        tf.reduce_sum(input_tensor=tf.multiply(
+            correct_chars, weights), axis=1),
+        tf.reduce_sum(input_tensor=weights, axis=1))
    if streaming:
      return tf.contrib.metrics.streaming_mean(accuracy_per_example)
    else:
-      return tf.reduce_mean(accuracy_per_example)
+      return tf.reduce_mean(input_tensor=accuracy_per_example)
 def sequence_accuracy(predictions, targets, rej_char, streaming=False):
@@ -66,25 +67,26 @@ def sequence_accuracy(predictions, targets, rej_char, streaming=False):
    returns the total sequence accuracy.
  """
-  with tf.variable_scope('SequenceAccuracy'):
+  with tf.compat.v1.variable_scope('SequenceAccuracy'):
    predictions.get_shape().assert_is_compatible_with(targets.get_shape())
-    targets = tf.to_int32(targets)
+    targets = tf.cast(targets, dtype=tf.int32)
    const_rej_char = tf.constant(
        rej_char, shape=targets.get_shape(), dtype=tf.int32)
    include_mask = tf.not_equal(targets, const_rej_char)
-    include_predictions = tf.to_int32(
-        tf.where(include_mask, predictions,
-                 tf.zeros_like(predictions) + rej_char))
-    correct_chars = tf.to_float(tf.equal(include_predictions, targets))
+    include_predictions = tf.cast(
+        tf.compat.v1.where(include_mask, predictions,
+                           tf.zeros_like(predictions) + rej_char), dtype=tf.int32)
+    correct_chars = tf.cast(
+        tf.equal(include_predictions, targets), dtype=tf.float32)
    correct_chars_counts = tf.cast(
-        tf.reduce_sum(correct_chars, reduction_indices=[1]), dtype=tf.int32)
+        tf.reduce_sum(input_tensor=correct_chars, axis=[1]), dtype=tf.int32)
    target_length = targets.get_shape().dims[1].value
    target_chars_counts = tf.constant(
        target_length, shape=correct_chars_counts.get_shape())
-    accuracy_per_example = tf.to_float(
+    accuracy_per_example = tf.cast(
-        tf.equal(correct_chars_counts, target_chars_counts))
+        tf.equal(correct_chars_counts, target_chars_counts), dtype=tf.float32)
    if streaming:
      return tf.contrib.metrics.streaming_mean(accuracy_per_example)
    else:
-      return tf.reduce_mean(accuracy_per_example)
+      return tf.reduce_mean(input_tensor=accuracy_per_example)
--- a/research/attention_ocr/python/metrics_test.py
+++ b/research/attention_ocr/python/metrics_test.py
@@ -38,8 +38,8 @@ class AccuracyTest(tf.test.TestCase):
      A session object that should be used as a context manager.
    """
    with self.cached_session() as sess:
-      sess.run(tf.global_variables_initializer())
+      sess.run(tf.compat.v1.global_variables_initializer())
-      sess.run(tf.local_variables_initializer())
+      sess.run(tf.compat.v1.local_variables_initializer())
      yield sess
  def _fake_labels(self):
@@ -55,7 +55,7 @@ class AccuracyTest(tf.test.TestCase):
    return incorrect
  def test_sequence_accuracy_identical_samples(self):
-    labels_tf = tf.convert_to_tensor(self._fake_labels())
+    labels_tf = tf.convert_to_tensor(value=self._fake_labels())
    accuracy_tf = metrics.sequence_accuracy(labels_tf, labels_tf,
                                            self.rej_char)
@@ -66,9 +66,9 @@ class AccuracyTest(tf.test.TestCase):
  def test_sequence_accuracy_one_char_difference(self):
    ground_truth_np = self._fake_labels()
-    ground_truth_tf = tf.convert_to_tensor(ground_truth_np)
+    ground_truth_tf = tf.convert_to_tensor(value=ground_truth_np)
    prediction_tf = tf.convert_to_tensor(
-        self._incorrect_copy(ground_truth_np, bad_indexes=((0, 0))))
+        value=self._incorrect_copy(ground_truth_np, bad_indexes=((0, 0))))
    accuracy_tf = metrics.sequence_accuracy(prediction_tf, ground_truth_tf,
                                            self.rej_char)
@@ -80,9 +80,9 @@ class AccuracyTest(tf.test.TestCase):
  def test_char_accuracy_one_char_difference_with_padding(self):
    ground_truth_np = self._fake_labels()
-    ground_truth_tf = tf.convert_to_tensor(ground_truth_np)
+    ground_truth_tf = tf.convert_to_tensor(value=ground_truth_np)
    prediction_tf = tf.convert_to_tensor(
-        self._incorrect_copy(ground_truth_np, bad_indexes=((0, 0))))
+        value=self._incorrect_copy(ground_truth_np, bad_indexes=((0, 0))))
    accuracy_tf = metrics.char_accuracy(prediction_tf, ground_truth_tf,
                                        self.rej_char)

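Note on the `metrics.py` hunks above: the migration only renames ops, so the computed values should be unchanged. The sketch below restates the non-streaming branches of `char_accuracy` and `sequence_accuracy` in plain Python lists, as a reference for checking the rewritten TensorFlow expressions. It is an illustration rather than the repository's code, and it assumes every target row has at least one character different from `rej_char` (otherwise the per-row division is by zero).

```python
def char_accuracy(predictions, targets, rej_char):
    """Mean per-example fraction of correct characters, ignoring rej_char targets."""
    per_example = []
    for pred_row, target_row in zip(predictions, targets):
        weights = [1.0 if t != rej_char else 0.0 for t in target_row]
        correct = [1.0 if p == t else 0.0 for p, t in zip(pred_row, target_row)]
        matched = sum(c * w for c, w in zip(correct, weights))
        per_example.append(matched / sum(weights))
    return sum(per_example) / len(per_example)


def sequence_accuracy(predictions, targets, rej_char):
    """Fraction of examples whose whole sequence matches, ignoring rej_char targets."""
    per_example = []
    for pred_row, target_row in zip(predictions, targets):
        # Positions whose target is rej_char are overwritten so they always match.
        masked = [p if t != rej_char else rej_char
                  for p, t in zip(pred_row, target_row)]
        per_example.append(1.0 if masked == list(target_row) else 0.0)
    return sum(per_example) / len(per_example)


if __name__ == "__main__":
    rej = 42
    targets = [[1, 2, 3, rej], [4, 5, rej, rej]]
    predictions = [[1, 2, 9, 7], [4, 5, 0, 0]]
    print(char_accuracy(predictions, targets, rej))
    print(sequence_accuracy(predictions, targets, rej))
```

In this example the first row gets two of its three scored characters right and the second row gets both, so only the second sequence counts as fully correct.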
--- a/research/attention_ocr/python/model.py
+++ b/research/attention_ocr/python/model.py
@@ -92,8 +92,8 @@ class CharsetMapper(object):
        Args:
          ids: a tensor with shape [batch_size, max_sequence_length]
    """
-    return tf.reduce_join(
+    return tf.strings.reduce_join(
-        self.table.lookup(tf.to_int64(ids)), reduction_indices=1)
+        inputs=self.table.lookup(tf.cast(ids, dtype=tf.int64)), axis=1)
 def get_softmax_loss_fn(label_smoothing):
@@ -110,7 +110,7 @@ def get_softmax_loss_fn(label_smoothing):
    def loss_fn(labels, logits):
      return (tf.nn.softmax_cross_entropy_with_logits(
-          logits=logits, labels=labels))
+          logits=logits, labels=tf.stop_gradient(labels)))
  else:
    def loss_fn(labels, logits):
@@ -140,7 +140,7 @@ def get_tensor_dimensions(tensor):
    raise ValueError(
        'Incompatible shape: len(tensor.get_shape().dims) != 4 (%d != 4)' %
        len(tensor.get_shape().dims))
-  batch_size = tf.shape(tensor)[0]
+  batch_size = tf.shape(input=tensor)[0]
  height = tensor.get_shape().dims[1].value
  width = tensor.get_shape().dims[2].value
  num_features = tensor.get_shape().dims[3].value
@@ -161,7 +161,7 @@ def lookup_indexed_value(indices, row_vecs):
    A tensor of shape (batch, ) formed by row_vecs[i, indices[i]].
  """
  gather_indices = tf.stack((tf.range(
-      tf.shape(row_vecs)[0], dtype=tf.int32), tf.cast(indices, tf.int32)),
+      tf.shape(input=row_vecs)[0], dtype=tf.int32), tf.cast(indices, tf.int32)),
      axis=1)
  return tf.gather_nd(row_vecs, gather_indices)
@@ -181,7 +181,7 @@ def max_char_logprob_cumsum(char_log_prob):
    so the same function can be used regardless whether use_length_predictions
    is true or false.
  """
-  max_char_log_prob = tf.reduce_max(char_log_prob, reduction_indices=2)
+  max_char_log_prob = tf.reduce_max(input_tensor=char_log_prob, axis=2)
  # For an input array [a, b, c]) tf.cumsum returns [a, a + b, a + b + c] if
  # exclusive set to False (default).
  return tf.cumsum(max_char_log_prob, axis=1, exclusive=False)
@@ -203,7 +203,7 @@ def find_length_by_null(predicted_chars, null_code):
    A [batch, ] tensor which stores the sequence length for each sample.
  """
  return tf.reduce_sum(
-      tf.cast(tf.not_equal(null_code, predicted_chars), tf.int32), axis=1)
+      input_tensor=tf.cast(tf.not_equal(null_code, predicted_chars), tf.int32), axis=1)
 def axis_pad(tensor, axis, before=0, after=0, constant_values=0.0):
@@ -248,7 +248,8 @@ def null_based_length_prediction(chars_log_prob, null_code):
    element #seq_length - is the probability of length=seq_length.
    predicted_length is a tensor with shape [batch].
  """
-  predicted_chars = tf.to_int32(tf.argmax(chars_log_prob, axis=2))
+  predicted_chars = tf.cast(
+      tf.argmax(input=chars_log_prob, axis=2), dtype=tf.int32)
  # We do right pad to support sequences with seq_length elements.
  text_log_prob = max_char_logprob_cumsum(
      axis_pad(chars_log_prob, axis=1, after=1))
@@ -334,9 +335,9 @@ class Model(object):
    """
    mparams = self._mparams['conv_tower_fn']
    logging.debug('Using final_endpoint=%s', mparams.final_endpoint)
-    with tf.variable_scope('conv_tower_fn/INCE'):
+    with tf.compat.v1.variable_scope('conv_tower_fn/INCE'):
      if reuse:
-        tf.get_variable_scope().reuse_variables()
+        tf.compat.v1.get_variable_scope().reuse_variables()
      with slim.arg_scope(inception.inception_v3_arg_scope()):
        with slim.arg_scope([slim.batch_norm, slim.dropout],
                            is_training=is_training):
@@ -372,7 +373,7 @@ class Model(object):
  def sequence_logit_fn(self, net, labels_one_hot):
    mparams = self._mparams['sequence_logit_fn']
    # TODO(gorban): remove /alias suffixes from the scopes.
-    with tf.variable_scope('sequence_logit_fn/SQLR'):
+    with tf.compat.v1.variable_scope('sequence_logit_fn/SQLR'):
      layer_class = sequence_layers.get_layer_class(mparams.use_attention,
                                                    mparams.use_autoregression)
      layer = layer_class(net, labels_one_hot, self._params, mparams)
@@ -392,7 +393,7 @@ class Model(object):
    ]
    xy_flat_shape = (batch_size, 1, height * width, num_features)
    nets_for_merge = []
-    with tf.variable_scope('max_pool_views', values=nets_list):
+    with tf.compat.v1.variable_scope('max_pool_views', values=nets_list):
      for net in nets_list:
        nets_for_merge.append(tf.reshape(net, xy_flat_shape))
      merged_net = tf.concat(nets_for_merge, 1)
@@ -413,10 +414,11 @@ class Model(object):
    Returns:
      A tensor of shape [batch_size, seq_length, features_size].
    """
-    with tf.variable_scope('pool_views_fn/STCK'):
+    with tf.compat.v1.variable_scope('pool_views_fn/STCK'):
      net = tf.concat(nets, 1)
-      batch_size = tf.shape(net)[0]
+      batch_size = tf.shape(input=net)[0]
-      image_size = net.get_shape().dims[1].value * net.get_shape().dims[2].value
+      image_size = net.get_shape().dims[1].value * \
+          net.get_shape().dims[2].value
      feature_size = net.get_shape().dims[3].value
      return tf.reshape(net, tf.stack([batch_size, image_size, feature_size]))
@@ -438,11 +440,13 @@ class Model(object):
          with shape [batch_size x seq_length].
    """
    log_prob = utils.logits_to_log_prob(chars_logit)
-    ids = tf.to_int32(tf.argmax(log_prob, axis=2), name='predicted_chars')
+    ids = tf.cast(tf.argmax(input=log_prob, axis=2),
+                  name='predicted_chars', dtype=tf.int32)
    mask = tf.cast(
        slim.one_hot_encoding(ids, self._params.num_char_classes), tf.bool)
    all_scores = tf.nn.softmax(chars_logit)
-    selected_scores = tf.boolean_mask(all_scores, mask, name='char_scores')
+    selected_scores = tf.boolean_mask(
+        tensor=all_scores, mask=mask, name='char_scores')
    scores = tf.reshape(
        selected_scores,
        shape=(-1, self._params.seq_length),
@@ -499,7 +503,7 @@ class Model(object):
    images = tf.subtract(images, 0.5)
    images = tf.multiply(images, 2.5)
-    with tf.variable_scope(scope, reuse=reuse):
+    with tf.compat.v1.variable_scope(scope, reuse=reuse):
      views = tf.split(
          value=images, num_or_size_splits=self._params.num_views, axis=2)
      logging.debug('Views=%d single view: %s', len(views), views[0])
@@ -566,7 +570,7 @@ class Model(object):
    # multiple losses including regularization losses.
    self.sequence_loss_fn(endpoints.chars_logit, data.labels)
    total_loss = slim.losses.get_total_loss()
-    tf.summary.scalar('TotalLoss', total_loss)
+    tf.compat.v1.summary.scalar('TotalLoss', total_loss)
    return total_loss
  def label_smoothing_regularization(self, chars_labels, weight=0.1):
@@ -605,7 +609,7 @@ class Model(object):
      A Tensor with shape [batch_size] - the log-perplexity for each sequence.
    """
    mparams = self._mparams['sequence_loss_fn']
-    with tf.variable_scope('sequence_loss_fn/SLF'):
+    with tf.compat.v1.variable_scope('sequence_loss_fn/SLF'):
      if mparams.label_smoothing > 0:
        smoothed_one_hot_labels = self.label_smoothing_regularization(
            chars_labels, mparams.label_smoothing)
@@ -625,7 +629,7 @@ class Model(object):
            shape=(batch_size, seq_length),
            dtype=tf.int64)
        known_char = tf.not_equal(chars_labels, reject_char)
-        weights = tf.to_float(known_char)
+        weights = tf.cast(known_char, dtype=tf.float32)
      logits_list = tf.unstack(chars_logits, axis=1)
      weights_list = tf.unstack(weights, axis=1)
@@ -635,7 +639,7 @@ class Model(object):
          weights_list,
          softmax_loss_function=get_softmax_loss_fn(mparams.label_smoothing),
          average_across_timesteps=mparams.average_across_timesteps)
-      tf.losses.add_loss(loss)
+      tf.compat.v1.losses.add_loss(loss)
      return loss
  def create_summaries(self, data, endpoints, charset, is_training):
@@ -665,13 +669,14 @@ class Model(object):
    # tf.summary.text(sname('text/pr'), pr_text)
    # gt_text = charset_mapper.get_text(data.labels[:max_outputs,:])
    # tf.summary.text(sname('text/gt'), gt_text)
-    tf.summary.image(sname('image'), data.images, max_outputs=max_outputs)
+    tf.compat.v1.summary.image(
+        sname('image'), data.images, max_outputs=max_outputs)
    if is_training:
-      tf.summary.image(
+      tf.compat.v1.summary.image(
          sname('image/orig'), data.images_orig, max_outputs=max_outputs)
-      for var in tf.trainable_variables():
+      for var in tf.compat.v1.trainable_variables():
-        tf.summary.histogram(var.op.name, var)
+        tf.compat.v1.summary.histogram(var.op.name, var)
      return None
    else:
@@ -700,7 +705,8 @@ class Model(object):
      for name, value in names_to_values.items():
        summary_name = 'eval/' + name
-        tf.summary.scalar(summary_name, tf.Print(value, [value], summary_name))
+        tf.compat.v1.summary.scalar(
+            summary_name, tf.compat.v1.Print(value, [value], summary_name))
      return list(names_to_updates.values())
  def create_init_fn_to_restore(self,
@@ -733,9 +739,9 @@ class Model(object):
    logging.info('variables_to_restore:\n%s',
                 utils.variables_to_restore().keys())
    logging.info('moving_average_variables:\n%s',
-                 [v.op.name for v in tf.moving_average_variables()])
+                 [v.op.name for v in tf.compat.v1.moving_average_variables()])
    logging.info('trainable_variables:\n%s',
-                 [v.op.name for v in tf.trainable_variables()])
+                 [v.op.name for v in tf.compat.v1.trainable_variables()])
    if master_checkpoint:
      assign_from_checkpoint(utils.variables_to_restore(), master_checkpoint)
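
Review note on the `predicted_chars` hunk in model.py: the migrated lines still take an argmax over the class axis, build a one-hot boolean mask from it, and use that mask to pick one softmax score per character position. The following is a minimal plain-Python sketch of that selection for a single sequence, not the repository's code. The helper names `softmax` and `predict_chars` are hypothetical, and the real model works on batched tensors of shape [batch_size, seq_length, num_char_classes].

```python
import math


def softmax(row):
    # Subtract the row maximum before exponentiating for numerical stability.
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def predict_chars(chars_logit):
    """chars_logit: seq_length rows, each holding num_char_classes logits."""
    ids, scores = [], []
    for row in chars_logit:
        probs = softmax(row)
        # Argmax over the class axis (the diff does this on log-probabilities,
        # which gives the same index because log is monotonic).
        best = max(range(len(probs)), key=probs.__getitem__)
        ids.append(best)
        # The one-hot boolean mask in the diff selects exactly this entry.
        scores.append(probs[best])
    return ids, scores


ids, scores = predict_chars([[0.1, 2.0, -1.0], [3.0, 0.0, 0.5]])
```

Because the mask has exactly one `True` per position, the masked result can be reshaped to one score per character, which is what the `tf.reshape(..., shape=(-1, seq_length))` call in the hunk relies on.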

--- a/research/attention_ocr/python/model_export.py
+++ b/research/attention_ocr/python/model_export.py
@@ -42,7 +42,8 @@ flags.DEFINE_integer(
    'image_height', None,
    'Image height used during training(or crop height if used)'
    ' If not set, the dataset default is used instead.')
-flags.DEFINE_string('work_dir', '/tmp', 'A directory to store temporary files.')
+flags.DEFINE_string('work_dir', '/tmp',
+                    'A directory to store temporary files.')
 flags.DEFINE_integer('version_number', 1, 'Version number of the model')
 flags.DEFINE_bool(
    'export_for_serving', True,
@@ -116,7 +117,7 @@ def export_model(export_dir,
  image_height = crop_image_height or dataset_image_height
  if export_for_serving:
-    images_orig = tf.placeholder(
+    images_orig = tf.compat.v1.placeholder(
        tf.string, shape=[batch_size], name='tf_example')
    images_orig_float = model_export_lib.generate_tfexample_image(
        images_orig,
@@ -126,22 +127,23 @@ def export_model(export_dir,
        name='float_images')
  else:
    images_shape = (batch_size, image_height, image_width, image_depth)
-    images_orig = tf.placeholder(
+    images_orig = tf.compat.v1.placeholder(
        tf.uint8, shape=images_shape, name='original_image')
    images_orig_float = tf.image.convert_image_dtype(
        images_orig, dtype=tf.float32, name='float_images')
  endpoints = model.create_base(images_orig_float, labels_one_hot=None)
-  sess = tf.Session()
+  sess = tf.compat.v1.Session()
-  saver = tf.train.Saver(slim.get_variables_to_restore(), sharded=True)
+  saver = tf.compat.v1.train.Saver(
+      slim.get_variables_to_restore(), sharded=True)
  saver.restore(sess, get_checkpoint_path())
-  tf.logging.info('Model restored successfully.')
+  tf.compat.v1.logging.info('Model restored successfully.')
  # Create model signature.
  if export_for_serving:
    input_tensors = {
-        tf.saved_model.signature_constants.CLASSIFY_INPUTS: images_orig
+        tf.saved_model.CLASSIFY_INPUTS: images_orig
    }
  else:
    input_tensors = {'images': images_orig}
@@ -163,21 +165,21 @@ def export_model(export_dir,
          dataset.max_sequence_length)):
    output_tensors['attention_mask_%d' % i] = t
  signature_outputs = model_export_lib.build_tensor_info(output_tensors)
-  signature_def = tf.saved_model.signature_def_utils.build_signature_def(
+  signature_def = tf.compat.v1.saved_model.signature_def_utils.build_signature_def(
      signature_inputs, signature_outputs,
-      tf.saved_model.signature_constants.CLASSIFY_METHOD_NAME)
+      tf.saved_model.CLASSIFY_METHOD_NAME)
  # Save model.
-  builder = tf.saved_model.builder.SavedModelBuilder(export_dir)
+  builder = tf.compat.v1.saved_model.builder.SavedModelBuilder(export_dir)
  builder.add_meta_graph_and_variables(
-      sess, [tf.saved_model.tag_constants.SERVING],
+      sess, [tf.saved_model.SERVING],
      signature_def_map={
-          tf.saved_model.signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY:
+          tf.saved_model.DEFAULT_SERVING_SIGNATURE_DEF_KEY:
              signature_def
      },
-      main_op=tf.tables_initializer(),
+      main_op=tf.compat.v1.tables_initializer(),
      strip_default_attrs=True)
  builder.save()
-  tf.logging.info('Model has been exported to %s' % export_dir)
+  tf.compat.v1.logging.info('Model has been exported to %s' % export_dir)
  return signature_def

--- a/research/attention_ocr/python/model_export_lib.py
+++ b/research/attention_ocr/python/model_export_lib.py
@@ -36,7 +36,7 @@ def normalize_image(image, original_minval, original_maxval, target_minval,
  Returns:
    image: image which is the same shape as input image.
  """
-  with tf.name_scope('NormalizeImage', values=[image]):
+  with tf.compat.v1.name_scope('NormalizeImage', values=[image]):
    original_minval = float(original_minval)
    original_maxval = float(original_maxval)
    target_minval = float(target_minval)
@@ -68,16 +68,17 @@ def generate_tfexample_image(input_example_strings,
    A tensor with shape [batch_size, height, width, channels] of type float32
    with values in the range [0..1]
  """
-  batch_size = tf.shape(input_example_strings)[0]
+  batch_size = tf.shape(input=input_example_strings)[0]
  images_shape = tf.stack(
      [batch_size, image_height, image_width, image_channels])
  tf_example_image_key = 'image/encoded'
  feature_configs = {
      tf_example_image_key:
-          tf.FixedLenFeature(
+          tf.io.FixedLenFeature(
              image_height * image_width * image_channels, dtype=tf.float32)
  }
-  feature_tensors = tf.parse_example(input_example_strings, feature_configs)
+  feature_tensors = tf.io.parse_example(
+      serialized=input_example_strings, features=feature_configs)
  float_images = tf.reshape(
      normalize_image(
          feature_tensors[tf_example_image_key],
@@ -97,11 +98,11 @@ def attention_ocr_attention_masks(num_characters):
  names = ['%s/Softmax:0' % (prefix)]
  for i in range(1, num_characters):
    names += ['%s_%d/Softmax:0' % (prefix, i)]
-  return [tf.get_default_graph().get_tensor_by_name(n) for n in names]
+  return [tf.compat.v1.get_default_graph().get_tensor_by_name(n) for n in names]
 def build_tensor_info(tensor_dict):
  return {
-      k: tf.saved_model.utils.build_tensor_info(t)
+      k: tf.compat.v1.saved_model.utils.build_tensor_info(t)
      for k, t in tensor_dict.items()
  }
--- a/research/attention_ocr/python/model_export_test.py
+++ b/research/attention_ocr/python/model_export_test.py
@@ -29,7 +29,7 @@ _CHECKPOINT_URL = (
 def _clean_up():
-  tf.gfile.DeleteRecursively(tf.test.get_temp_dir())
+  tf.io.gfile.rmtree(tf.compat.v1.test.get_temp_dir())
 def _create_tf_example_string(image):
@@ -47,7 +47,7 @@ class AttentionOcrExportTest(tf.test.TestCase):
    for suffix in ['.meta', '.index', '.data-00000-of-00001']:
      filename = _CHECKPOINT + suffix
      self.assertTrue(
-          tf.gfile.Exists(filename),
+          tf.io.gfile.exists(filename),
          msg='Missing checkpoint file %s. '
          'Please download and extract it from %s' %
          (filename, _CHECKPOINT_URL))
@@ -57,7 +57,8 @@ class AttentionOcrExportTest(tf.test.TestCase):
        os.path.dirname(__file__), 'datasets/testdata/fsns')
    tf.test.TestCase.setUp(self)
    _clean_up()
-    self.export_dir = os.path.join(tf.test.get_temp_dir(), 'exported_model')
+    self.export_dir = os.path.join(
+        tf.compat.v1.test.get_temp_dir(), 'exported_model')
    self.minimal_output_signature = {
        'predictions': 'AttentionOcr_v1/predicted_chars:0',
        'scores': 'AttentionOcr_v1/predicted_scores:0',
@@ -93,10 +94,10 @@ class AttentionOcrExportTest(tf.test.TestCase):
                              size=self.dataset.image_shape).astype('uint8'),
    }
    signature_def = graph_def.signature_def[
-        tf.saved_model.signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY]
+        tf.saved_model.DEFAULT_SERVING_SIGNATURE_DEF_KEY]
    if serving:
      input_name = signature_def.inputs[
-          tf.saved_model.signature_constants.CLASSIFY_INPUTS].name
+          tf.saved_model.CLASSIFY_INPUTS].name
      # Model for serving takes input: inputs['inputs'] = 'tf_example:0'
      feed_dict = {
          input_name: [
@@ -126,11 +127,11 @@ class AttentionOcrExportTest(tf.test.TestCase):
      export_for_serving: True if the model was exported for Serving. This
        affects how input is fed into the model.
    """
-    tf.reset_default_graph()
+    tf.compat.v1.reset_default_graph()
-    sess = tf.Session()
+    sess = tf.compat.v1.Session()
-    graph_def = tf.saved_model.loader.load(
+    graph_def = tf.compat.v1.saved_model.loader.load(
        sess=sess,
-        tags=[tf.saved_model.tag_constants.SERVING],
+        tags=[tf.saved_model.SERVING],
        export_dir=self.export_dir)
    feed_dict = self.create_input_feed(graph_def, export_for_serving)
    results = sess.run(self.minimal_output_signature, feed_dict=feed_dict)

--- a/research/attention_ocr/python/model_test.py
+++ b/research/attention_ocr/python/model_test.py
@@ -52,7 +52,7 @@ class ModelTest(tf.test.TestCase):
                              self.num_char_classes)
    self.length_logit_shape = (self.batch_size, self.seq_length + 1)
    # Placeholder knows image dimensions, but not batch size.
-    self.input_images = tf.placeholder(
+    self.input_images = tf.compat.v1.placeholder(
        tf.float32,
        shape=(None, self.image_height, self.image_width, 3),
        name='input_node')
@@ -89,8 +89,8 @@ class ModelTest(tf.test.TestCase):
    with self.test_session() as sess:
      endpoints_tf = ocr_model.create_base(
          images=self.input_images, labels_one_hot=None)
-      sess.run(tf.global_variables_initializer())
+      sess.run(tf.compat.v1.global_variables_initializer())
-      tf.tables_initializer().run()
+      tf.compat.v1.tables_initializer().run()
      endpoints = sess.run(
          endpoints_tf, feed_dict={self.input_images: self.fake_images})
@@ -127,7 +127,7 @@ class ModelTest(tf.test.TestCase):
      ocr_model = self.create_model()
      conv_tower = ocr_model.conv_tower_fn(self.input_images)
-      sess.run(tf.global_variables_initializer())
+      sess.run(tf.compat.v1.global_variables_initializer())
      conv_tower_np = sess.run(
          conv_tower, feed_dict={self.input_images: self.fake_images})
@@ -141,9 +141,9 @@ class ModelTest(tf.test.TestCase):
    ocr_model = self.create_model()
    ocr_model.create_base(images=self.input_images, labels_one_hot=None)
    with self.test_session() as sess:
-      tfprof_root = tf.profiler.profile(
+      tfprof_root = tf.compat.v1.profiler.profile(
          sess.graph,
-          options=tf.profiler.ProfileOptionBuilder
+          options=tf.compat.v1.profiler.ProfileOptionBuilder
          .trainable_variables_parameter())
      model_size_bytes = 4 * tfprof_root.total_parameters
@@ -163,9 +163,9 @@ class ModelTest(tf.test.TestCase):
    summaries = ocr_model.create_summaries(
        data, endpoints, charset, is_training=False)
    with self.test_session() as sess:
-      sess.run(tf.global_variables_initializer())
+      sess.run(tf.compat.v1.global_variables_initializer())
-      sess.run(tf.local_variables_initializer())
+      sess.run(tf.compat.v1.local_variables_initializer())
-      tf.tables_initializer().run()
+      tf.compat.v1.tables_initializer().run()
      sess.run(summaries)  # just check it is runnable
  def test_sequence_loss_function_without_label_smoothing(self):
@@ -188,7 +188,7 @@ class ModelTest(tf.test.TestCase):
    Returns:
      a list of tensors with encoded image coordinates in them.
    """
-    batch_size = tf.shape(net)[0]
+    batch_size = tf.shape(input=net)[0]
    _, h, w, _ = net.shape.as_list()
    h_loc = [
        tf.tile(
@@ -200,7 +200,8 @@ class ModelTest(tf.test.TestCase):
    h_loc = tf.concat([tf.expand_dims(t, 2) for t in h_loc], 2)
    w_loc = [
        tf.tile(
-            tf.contrib.layers.one_hot_encoding(tf.constant([i]), num_classes=w),
+            tf.contrib.layers.one_hot_encoding(
+                tf.constant([i]), num_classes=w),
            [h, 1]) for i in range(w)
    ]
    w_loc = tf.concat([tf.expand_dims(t, 2) for t in w_loc], 2)
@@ -272,8 +273,8 @@ class ModelTest(tf.test.TestCase):
      endpoints_tf = ocr_model.create_base(
          images=self.fake_images, labels_one_hot=None)
-      sess.run(tf.global_variables_initializer())
+      sess.run(tf.compat.v1.global_variables_initializer())
-      tf.tables_initializer().run()
+      tf.compat.v1.tables_initializer().run()
      endpoints = sess.run(endpoints_tf)
      self.assertEqual(endpoints.predicted_text.shape, (self.batch_size,))
@@ -289,7 +290,7 @@ class CharsetMapperTest(tf.test.TestCase):
    charset_mapper = model.CharsetMapper(charset)
    with self.test_session() as sess:
-      tf.tables_initializer().run()
+      tf.compat.v1.tables_initializer().run()
      text = sess.run(charset_mapper.get_text(ids))
    self.assertAllEqual(text, [b'hello', b'world'])

--- a/research/attention_ocr/python/sequence_layers.py
+++ b/research/attention_ocr/python/sequence_layers.py
@@ -111,12 +111,12 @@ class SequenceLayerBase(object):
    self._mparams = method_params
    self._net = net
    self._labels_one_hot = labels_one_hot
-    self._batch_size = tf.shape(net)[0]
+    self._batch_size = tf.shape(input=net)[0]
    # Initialize parameters for char logits which will be computed on the fly
    # inside an LSTM decoder.
    self._char_logits = {}
-    regularizer = slim.l2_regularizer(self._mparams.weight_decay)
+    regularizer = tf.keras.regularizers.l2(0.5 * (self._mparams.weight_decay))
    self._softmax_w = slim.model_variable(
        'softmax_w',
        [self._mparams.num_lstm_units, self._params.num_char_classes],
@@ -124,7 +124,7 @@ class SequenceLayerBase(object):
        regularizer=regularizer)
    self._softmax_b = slim.model_variable(
        'softmax_b', [self._params.num_char_classes],
-        initializer=tf.zeros_initializer(),
+        initializer=tf.compat.v1.zeros_initializer(),
        regularizer=regularizer)
  @abc.abstractmethod
@@ -203,7 +203,7 @@ class SequenceLayerBase(object):
      A tensor with shape [batch_size, num_char_classes]
    """
    if char_index not in self._char_logits:
-      self._char_logits[char_index] = tf.nn.xw_plus_b(inputs, self._softmax_w,
+      self._char_logits[char_index] = tf.compat.v1.nn.xw_plus_b(inputs, self._softmax_w,
                                                                self._softmax_b)
    return self._char_logits[char_index]
@@ -216,7 +216,7 @@ class SequenceLayerBase(object):
    Returns:
      A tensor with shape [batch_size, num_char_classes]
    """
-    prediction = tf.argmax(logit, axis=1)
+    prediction = tf.argmax(input=logit, axis=1)
    return slim.one_hot_encoding(prediction, self._params.num_char_classes)
  def get_input(self, prev, i):
@@ -244,10 +244,10 @@ class SequenceLayerBase(object):
    Returns:
      A tensor with shape [batch_size, seq_length, num_char_classes].
    """
-    with tf.variable_scope('LSTM'):
+    with tf.compat.v1.variable_scope('LSTM'):
      first_label = self.get_input(prev=None, i=0)
      decoder_inputs = [first_label] + [None] * (self._params.seq_length - 1)
-      lstm_cell = tf.contrib.rnn.LSTMCell(
+      lstm_cell = tf.compat.v1.nn.rnn_cell.LSTMCell(
          self._mparams.num_lstm_units,
          use_peepholes=False,
          cell_clip=self._mparams.lstm_state_clip_value,
@@ -259,9 +259,9 @@ class SequenceLayerBase(object):
          loop_function=self.get_input,
          cell=lstm_cell)
-    with tf.variable_scope('logits'):
+    with tf.compat.v1.variable_scope('logits'):
      logits_list = [
-          tf.expand_dims(self.char_logit(logit, i), dim=1)
+          tf.expand_dims(self.char_logit(logit, i), axis=1)
          for i, logit in enumerate(lstm_outputs)
      ]
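
Review note on the regularizer change in sequence_layers.py: the factor `0.5` in `tf.keras.regularizers.l2(0.5 * weight_decay)` looks intentional rather than a tuning change. To my understanding, `slim.l2_regularizer(scale)` computes `scale * tf.nn.l2_loss(w)`, where `l2_loss` is `sum(w**2) / 2`, while the Keras regularizer computes `l2 * sum(w**2)` with no halving. The sketch below checks that arithmetic in plain Python. The function names and the sample values are hypothetical and only mirror the two formulas as described here.

```python
def slim_style_l2(weights, scale):
    # Mirrors scale * l2_loss(w), with l2_loss(w) = sum(w**2) / 2.
    return scale * sum(w * w for w in weights) / 2.0


def keras_style_l2(weights, l2):
    # Mirrors a Keras-style L2 penalty: l2 * sum(w**2), no factor of 1/2.
    return l2 * sum(w * w for w in weights)


weights = [0.5, -1.5, 2.0]   # illustrative values only
weight_decay = 0.00004       # illustrative value only

old_penalty = slim_style_l2(weights, weight_decay)
new_penalty = keras_style_l2(weights, 0.5 * weight_decay)
```

Under these assumptions the two penalties agree, so the migrated line should leave the regularization strength unchanged.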

--- a/research/attention_ocr/python/sequence_layers_test.py
+++ b/research/attention_ocr/python/sequence_layers_test.py
@@ -29,13 +29,13 @@ import sequence_layers
 def fake_net(batch_size, num_features, feature_size):
  return tf.convert_to_tensor(
-      np.random.uniform(size=(batch_size, num_features, feature_size)),
+      value=np.random.uniform(size=(batch_size, num_features, feature_size)),
      dtype=tf.float32)
 def fake_labels(batch_size, seq_length, num_char_classes):
  labels_np = tf.convert_to_tensor(
-      np.random.randint(
+      value=np.random.randint(
          low=0, high=num_char_classes, size=(batch_size, seq_length)))
  return slim.one_hot_encoding(labels_np, num_classes=num_char_classes)

--- a/research/attention_ocr/python/train.py
+++ b/research/attention_ocr/python/train.py
@@ -96,16 +96,16 @@ def get_training_hparams():
 def create_optimizer(hparams):
  """Creates optimized based on the specified flags."""
  if hparams.optimizer == 'momentum':
-    optimizer = tf.train.MomentumOptimizer(
+    optimizer = tf.compat.v1.train.MomentumOptimizer(
        hparams.learning_rate, momentum=hparams.momentum)
  elif hparams.optimizer == 'adam':
-    optimizer = tf.train.AdamOptimizer(hparams.learning_rate)
+    optimizer = tf.compat.v1.train.AdamOptimizer(hparams.learning_rate)
  elif hparams.optimizer == 'adadelta':
-    optimizer = tf.train.AdadeltaOptimizer(hparams.learning_rate)
+    optimizer = tf.compat.v1.train.AdadeltaOptimizer(hparams.learning_rate)
  elif hparams.optimizer == 'adagrad':
-    optimizer = tf.train.AdagradOptimizer(hparams.learning_rate)
+    optimizer = tf.compat.v1.train.AdagradOptimizer(hparams.learning_rate)
  elif hparams.optimizer == 'rmsprop':
-    optimizer = tf.train.RMSPropOptimizer(
+    optimizer = tf.compat.v1.train.RMSPropOptimizer(
        hparams.learning_rate, momentum=hparams.momentum)
  return optimizer
@@ -154,14 +154,14 @@ def train(loss, init_fn, hparams):
 def prepare_training_dir():
-  if not tf.gfile.Exists(FLAGS.train_log_dir):
+  if not tf.io.gfile.exists(FLAGS.train_log_dir):
    logging.info('Create a new training directory %s', FLAGS.train_log_dir)
-    tf.gfile.MakeDirs(FLAGS.train_log_dir)
+    tf.io.gfile.makedirs(FLAGS.train_log_dir)
  else:
    if FLAGS.reset_train_dir:
      logging.info('Reset the training directory %s', FLAGS.train_log_dir)
-      tf.gfile.DeleteRecursively(FLAGS.train_log_dir)
+      tf.io.gfile.rmtree(FLAGS.train_log_dir)
-      tf.gfile.MakeDirs(FLAGS.train_log_dir)
+      tf.io.gfile.makedirs(FLAGS.train_log_dir)
    else:
      logging.info('Use already existing training directory %s',
                   FLAGS.train_log_dir)
@@ -169,7 +169,7 @@ def prepare_training_dir():
 def calculate_graph_metrics():
  param_stats = model_analyzer.print_model_analysis(
-      tf.get_default_graph(),
+      tf.compat.v1.get_default_graph(),
      tfprof_options=model_analyzer.TRAINABLE_VARS_PARAMS_STAT_OPTIONS)
  return param_stats.total_parameters
@@ -186,7 +186,7 @@ def main(_):
  # If ps_tasks is zero, the local device is used. When using multiple
  # (non-local) replicas, the ReplicaDeviceSetter distributes the variables
  # across the different devices.
-  device_setter = tf.train.replica_device_setter(
+  device_setter = tf.compat.v1.train.replica_device_setter(
      FLAGS.ps_tasks, merge_devices=True)
  with tf.device(device_setter):
    data = data_provider.get_data(

--- a/research/attention_ocr/python/utils.py
+++ b/research/attention_ocr/python/utils.py
@@ -37,16 +37,16 @@ def logits_to_log_prob(logits):
    probabilities.
  """
-  with tf.variable_scope('log_probabilities'):
+  with tf.compat.v1.variable_scope('log_probabilities'):
    reduction_indices = len(logits.shape.as_list()) - 1
    max_logits = tf.reduce_max(
-        logits, reduction_indices=reduction_indices, keep_dims=True)
+        input_tensor=logits, axis=reduction_indices, keepdims=True)
    safe_logits = tf.subtract(logits, max_logits)
    sum_exp = tf.reduce_sum(
-        tf.exp(safe_logits),
+        input_tensor=tf.exp(safe_logits),
-        reduction_indices=reduction_indices,
+        axis=reduction_indices,
-        keep_dims=True)
+        keepdims=True)
-    log_probs = tf.subtract(safe_logits, tf.log(sum_exp))
+    log_probs = tf.subtract(safe_logits, tf.math.log(sum_exp))
  return log_probs
@@ -91,7 +91,7 @@ def ConvertAllInputsToTensors(func):
  """
  def FuncWrapper(*args):
-    tensors = [tf.convert_to_tensor(a) for a in args]
+    tensors = [tf.convert_to_tensor(value=a) for a in args]
    return func(*tensors)
  return FuncWrapper
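
Review note on the `logits_to_log_prob` hunk in utils.py: the renamed arguments (`axis`, `keepdims`, `tf.math.log`) leave the computation itself unchanged. It is a log-softmax that subtracts the maximum logit before exponentiating. Below is a minimal plain-Python sketch of the same steps for a single vector of logits. It is an illustration that reuses the function name, not the TensorFlow implementation, and the input values are made up.

```python
import math


def logits_to_log_prob(logits):
    # Shift by the maximum so the largest exponent is exp(0) = 1.
    max_logit = max(logits)
    safe_logits = [v - max_logit for v in logits]
    # Log of the normalizer, computed on the shifted values.
    log_sum_exp = math.log(sum(math.exp(v) for v in safe_logits))
    return [v - log_sum_exp for v in safe_logits]


# Logits this large would overflow math.exp without the shift.
log_probs = logits_to_log_prob([1000.0, 1001.0, 1002.0])
```

Exponentiating the result gives probabilities that sum to one, and shifting every logit by the same constant does not change the output.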