eager_execution.ipynb

{
  "nbformat": 4,
  "nbformat_minor": 0,
  "metadata": {
    "colab": {
      "name": "Eager Execution: Dev Summit 2018",
      "version": "0.3.2",
      "views": {},
      "default_view": {},
      "provenance": [],
      "collapsed_sections": []
    },
    "kernelspec": {
      "name": "python3",
      "display_name": "Python 3"
    }
  },
  "cells": [
    {
      "metadata": {
        "id": "p-esxQ2Ah4ab",
        "colab_type": "text"
      },
      "cell_type": "markdown",
      "source": [
        "##### Copyright 2018 The TensorFlow Authors.\n",
        "\n",
        "Licensed under the Apache License, Version 2.0 (the \"License\");"
      ]
    },
    {
      "metadata": {
        "id": "Xqp-XvX5h7Ff",
        "colab_type": "code",
        "colab": {
          "autoexec": {
            "startup": false,
            "wait_interval": 0
          }
        }
      },
      "cell_type": "code",
      "source": [
        "# Licensed under the Apache License, Version 2.0 (the \"License\");\n",
        "# you may not use this file except in compliance with the License.\n",
        "# You may obtain a copy of the License at\n",
        "#\n",
        "# https://www.apache.org/licenses/LICENSE-2.0\n",
        "#\n",
        "# Unless required by applicable law or agreed to in writing, software\n",
        "# distributed under the License is distributed on an \"AS IS\" BASIS,\n",
        "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n",
        "# See the License for the specific language governing permissions and\n",
        "# limitations under the License."
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "metadata": {
        "id": "g7nGs4mzVUHP",
        "colab_type": "text"
      },
      "cell_type": "markdown",
      "source": [
        "# Eager execution\n",
        "\n",
        "Note: you can run **[this notebook, live in Google Colab](https://colab.research.google.com/github/tensorflow/models/blob/master/samples/outreach/demos/eager_execution.ipynb)** with zero setup. \n",
        "\n",
        "**TensorFlow Dev Summit, 2018.**\n",
        "\n",
        "This interactive notebook demonstrates **eager execution**, TensorFlow's imperative, NumPy-like front-end for machine learning.\n",
        "\n",
        "> ![alt text](https://lh3.googleusercontent.com/QOvy0clmg7siaVKzwmSPAjicWWNQ0OeyaB16plDjSJMf35WD3vLjF6mz4CGrhSHw60HnlZPJjkyDCBzw5XOI0oBGSewyYw=s688)\n",
        "\n",
        "**Table of Contents.**\n",
        "1. _Enabling eager execution!_\n",
        "2. _A NumPy-like library for numerical computation and machine learning. Case study: Fitting a huber regression_.\n",
        "3. _Neural networks. Case study: Training a multi-layer RNN._\n",
        "4. _Exercises: Batching; debugging._\n",
        "5. _Further reading_"
      ]
    },
    {
      "metadata": {
        "id": "ZVKfj5ttVkqz",
        "colab_type": "text"
      },
      "cell_type": "markdown",
      "source": [
        "# 1. Enabling eager execution!\n",
        "\n",
        "A single function call is all you need to enable eager execution: `tf.enable_eager_execution()`. You should invoke this function before calling into any other TensorFlow APIs --- the simplest way to satisfy this requirement is to make `tf.enable_eager_execution()` the first line of your `main` function.\n"
      ]
    },
    {
      "metadata": {
        "id": "C783D4QKVlK1",
        "colab_type": "code",
        "colab": {
          "autoexec": {
            "startup": false,
            "wait_interval": 0
          }
        }
      },
      "cell_type": "code",
      "source": [
        "!pip install -q -U tf-nightly\n",
        "\n",
        "import tensorflow as tf\n",
        "\n",
        "tf.enable_eager_execution()"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "metadata": {
        "id": "trrHQBM1VnD0",
        "colab_type": "text"
      },
      "cell_type": "markdown",
      "source": [
        "# 2. A NumPy-like library for numerical computation and machine learning\n",
        "Enabling eager execution transforms TensorFlow into an **imperative** library for numerical computation, automatic differentiation, and machine learning. When executing eagerly, _TensorFlow no longer behaves like a dataflow graph engine_: Tensors are backed by NumPy arrays (goodbye, placeholders!), and TensorFlow operations execute *immediately* via Python (goodbye, sessions!)."
      ]
    },
    {
      "metadata": {
        "id": "MLUSuZuccgmF",
        "colab_type": "text"
      },
      "cell_type": "markdown",
      "source": [
        "### Numpy-like usage\n",
        "\n",
        "Tensors are backed by numpy arrays, which are accessible via their `.numpy()`\n",
        "method."
      ]
    },
    {
      "metadata": {
        "id": "lzrktlC0cPi1",
        "colab_type": "code",
        "colab": {
          "autoexec": {
            "startup": false,
            "wait_interval": 0
          }
        }
      },
      "cell_type": "code",
      "source": [
        "A = tf.constant([[2.0, 0.0], [0.0, 3.0]])"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "metadata": {
        "id": "F5oDeGhYcX6c",
        "colab_type": "code",
        "colab": {
          "autoexec": {
            "startup": false,
            "wait_interval": 0
          }
        }
      },
      "cell_type": "code",
      "source": [
        "import numpy as np\n",
        "\n",
        "print(\"Tensors are backed by NumPy arrays, which are accessible through their \"\n",
        "      \"`.numpy()` method:\\n\", A)\n",
        "assert(type(A.numpy()) == np.ndarray)\n",
        "print(\"\\nOperations (like `tf.matmul(A, A)`) execute \"\n",
        "      \"immediately (no more Sessions!):\\n\", tf.matmul(A, A))"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "metadata": {
        "id": "SRCTcyCocvBq",
        "colab_type": "text"
      },
      "cell_type": "markdown",
      "source": [
        "Tensors behave similarly to NumPy arrays, but they don't behave exactly the\n",
        "same. \n",
        "\n",
        "For example, the equals operator on Tensors compares objects. Use\n",
	"`tf.equal` to compare values."
      ]
    },
    {
      "metadata": {
        "id": "OgBX6BJdcZ8w",
        "colab_type": "code",
        "colab": {
          "autoexec": {
            "startup": false,
            "wait_interval": 0
          }
        }
      },
      "cell_type": "code",
      "source": [
        "print(\"\\nTensors behave like NumPy arrays: you can iterate over them and \"\n",
        "      \"supply them as inputs to most functions that expect NumPy arrays:\")\n",
        "for i, row in enumerate(A):\n",
        "  for j, entry in enumerate(row):\n",
        "    print(\"A[%d, %d]^2 == %d\" % (i, j, np.square(entry)))"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "metadata": {
        "id": "Q-o-XayRdAEi",
        "colab_type": "text"
      },
      "cell_type": "markdown",
      "source": [
        "### Variables and Gradients\n",
        "\n",
        "Create variables with `tf.contrib.eager.Variable`, and use `tf.GradientTape`\n",
        "to compute gradients with respect to them."
      ]
    },
    {
      "metadata": {
        "id": "PGAqOzqzccwd",
        "colab_type": "code",
        "colab": {
          "autoexec": {
            "startup": false,
            "wait_interval": 0
          }
        }
      },
      "cell_type": "code",
      "source": [
        "import tensorflow.contrib.eager as tfe\n",
        "w = tfe.Variable(3.0)\n",
        "with tf.GradientTape() as tape:\n",
        "  loss = w ** 2\n",
        "dw, = tape.gradient(loss, [w])\n",
        "print(\"\\nYou can use `tf.GradientTape` to compute the gradient of a \"\n",
        "      \"computation with respect to a list of `tf.contrib.eager.Variable`s;\\n\"\n",
        "      \"for example, `tape.gradient(loss, [w])`, where `loss` = w ** 2 and \"\n",
        "      \"`w` == 3.0, yields`\", dw,\"`.\")"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "metadata": {
        "id": "gZFXrVTKdFnl",
        "colab_type": "text"
      },
      "cell_type": "markdown",
      "source": [
        "### GPU usage\n",
        "Eager execution lets you offload computation to hardware accelerators like\n",
        "GPUs, if you have any available."
      ]
    },
    {
      "metadata": {
        "id": "ER-Hsk3RVmX9",
        "colab_type": "code",
        "colab": {
          "autoexec": {
            "startup": false,
            "wait_interval": 0
          }
        },
        "cellView": "both"
      },
      "cell_type": "code",
      "source": [
        "if tf.test.is_gpu_available():\n",
        "  with tf.device(tf.test.gpu_device_name()):\n",
        "    B = tf.constant([[2.0, 0.0], [0.0, 3.0]])\n",
        "    print(tf.matmul(B, B))"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "metadata": {
        "id": "JQ8kQT99VqDk",
        "colab_type": "text"
      },
      "cell_type": "markdown",
      "source": [
        "## Fitting a Huber regression\n",
        "\n",
        "If you come from a scientific or numerical computing background, eager execution should feel natural to you. Not only does it stand on its own as an accelerator-compatible library for numerical computation, it also interoperates with popular Python packages like NumPy and Matplotlib. To demonstrate this fact, in this section, we fit and evaluate a regression using a [Huber regression](https://en.wikipedia.org/wiki/Huber_loss), writing our code in a NumPy-like way and making use of Python control flow."
      ]
    },
    {
      "metadata": {
        "id": "6dXt0WfBK9-7",
        "colab_type": "text"
      },
      "cell_type": "markdown",
      "source": [
        "### Data generation\n",
        "\n",
        "Our dataset for this example has many outliers — least-squares would be a poor choice."
      ]
    },
    {
      "metadata": {
        "id": "Il1zLdgjVslU",
        "colab_type": "code",
        "colab": {
          "autoexec": {
            "startup": false,
            "wait_interval": 0
          }
        },
        "cellView": "code"
      },
      "cell_type": "code",
      "source": [
        "import matplotlib.pyplot as plt\n",
        "\n",
        "def gen_regression_data(num_examples=1000, p=0.2):\n",
        "  X = tf.random_uniform(shape=(num_examples,), maxval=50)\n",
        "  w_star = tf.random_uniform(shape=(), maxval=10)\n",
        "  b_star = tf.random_uniform(shape=(), maxval=10)\n",
        "  noise = tf.random_normal(shape=(num_examples,), mean=0.0, stddev=10.0)\n",
        "  # With probability 1 - p, y := y * -1.\n",
        "  sign = 2 * np.random.binomial(1, 1 - p, size=(num_examples,)) - 1\n",
        "  # You can freely mix Tensors and NumPy arrays in your computations:\n",
        "  # `sign` is a NumPy array, but the other symbols below are Tensors.\n",
        "  Y = sign * (w_star * X + b_star + noise)  \n",
        "  return X, Y\n",
        "\n",
        "X, Y = gen_regression_data()\n",
        "plt.plot(X, Y, \"go\")  # You can plot Tensors!\n",
        "plt.title(\"Observed data\")\n",
        "plt.show()"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "metadata": {
        "id": "sYumjOrdMRFM",
        "colab_type": "text"
      },
      "cell_type": "markdown",
      "source": [
        "### Huber loss\n",
        "The Huber loss function is piecewise function that is quadratic for small inputs and linear otherwise; for that reason, using a Huber loss gives considerably less weight to outliers than least-squares does. When eager execution is enabled, we can implement the Huber function in the natural way, using **Python control flow**."
      ]
    },
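    {
      "metadata": {
        "colab_type": "text"
      },
      "cell_type": "markdown",
      "source": [
        "To make the piecewise definition concrete, the sketch below evaluates it on a few scalar residuals in plain Python, next to the squared error. `huber_scalar` is a hypothetical helper written only for this illustration and is not part of TensorFlow; it assumes the same scaling as the `huber_loss` function in the next cell (twice the textbook Huber loss), with threshold `m`.\n",
        "\n",
        "```python\n",
        "def huber_scalar(residual, m=1.0):\n",
        "  # Illustrative scalar version of this notebook's Huber variant.\n",
        "  delta = abs(residual)\n",
        "  # Quadratic inside the threshold `m`, linear outside it.\n",
        "  return delta ** 2 if delta <= m else m * (2 * delta - m)\n",
        "\n",
        "for residual in [0.5, 1.0, 3.0, 10.0]:\n",
        "  # Columns: residual, Huber variant, squared error.\n",
        "  print(residual, huber_scalar(residual), residual ** 2)\n",
        "```\n",
        "\n",
        "For a residual of 10 the squared error is 100, while this Huber variant gives 19, so a single outlier pulls the fit far less."
      ]
    },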
    {
      "metadata": {
        "id": "anflUCeaVtK8",
        "colab_type": "code",
        "colab": {
          "autoexec": {
            "startup": false,
            "wait_interval": 0
          }
        }
      },
      "cell_type": "code",
      "source": [
        "def huber_loss(y, y_hat, m=1.0):\n",
        "  # Enabling eager execution lets you use Python control flow.\n",
        "  delta = tf.abs(y - y_hat)\n",
        "  return delta ** 2 if delta <= m else m * (2 * delta - m)"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "metadata": {
        "id": "0_OALYGwM7ma",
        "colab_type": "text"
      },
      "cell_type": "markdown",
      "source": [
        "### A simple class for regressions\n",
        "\n",
        "The next cell encapsulates a linear regression model in a Python class and defines a\n",
        "function that fits the model using a stochastic optimizer."
      ]
    },
    {
      "metadata": {
        "id": "-90due2RVuDF",
        "colab_type": "code",
        "colab": {
          "autoexec": {
            "startup": false,
            "wait_interval": 0
          }
        },
        "cellView": "code"
      },
      "cell_type": "code",
      "source": [
        "import time\n",
        "\n",
        "from google.colab import widgets\n",
        "import tensorflow.contrib.eager as tfe  # Needed to create tfe.Variable objects.\n",
        "\n",
        "\n",
        "class Regression(object):\n",
        "  def __init__(self, loss_fn):\n",
        "    super(Regression, self).__init__()\n",
        "    self.w = tfe.Variable(0.0)\n",
        "    self.b = tfe.Variable(0.0)\n",
        "    self.variables = [self.w, self.b]\n",
        "    self.loss_fn = loss_fn\n",
        "      \n",
        "  def predict(self, x):\n",
        "    return x * self.w + self.b\n",
        "  \n",
        "def regress(model, optimizer, dataset, epochs=5, log_every=1, num_examples=1000):\n",
        "  plot = log_every is not None\n",
        "  if plot:\n",
        "    # Colab provides several widgets for interactive visualization.\n",
        "    tb = widgets.TabBar([str(i) for i in range(epochs) if i % log_every == 0])\n",
        "    X, Y = dataset.batch(num_examples).make_one_shot_iterator().get_next()\n",
        "    X = tf.reshape(X, (num_examples,))\n",
        "    Y = tf.reshape(Y, (num_examples,))\n",
        "    \n",
        "  for epoch in range(epochs):\n",
        "    iterator = dataset.make_one_shot_iterator()\n",
        "    epoch_loss = 0.0\n",
        "    start = time.time()\n",
        "    for x_i, y_i in iterator:\n",
        "      batch_loss_fn = lambda: model.loss_fn(y_i, model.predict(x_i))  \n",
        "      optimizer.minimize(batch_loss_fn, var_list=model.variables)\n",
        "      epoch_loss += batch_loss_fn()\n",
        "    duration = time.time() - start\n",
        "    if plot and epoch % log_every == 0:\n",
        "      with tb.output_to(str(epoch)):\n",
        "        print(\"Epoch %d took %0.2f seconds, resulting in a loss of %0.4f.\" % (\n",
        "            epoch, duration, epoch_loss))\n",
        "        plt.plot(X, Y, \"go\", label=\"data\")\n",
        "        plt.plot(X, model.predict(X), \"b\", label=\"regression\")\n",
        "        plt.legend()"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "metadata": {
        "id": "Z8WdS6LQNc5K",
        "colab_type": "text"
      },
      "cell_type": "markdown",
      "source": [
        "Run the following cell to fit the model! Note that enabling eager execution makes it\n",
        "easy to visualize your model while training it, using  familiar tools like Matplotlib."
      ]
    },
    {
      "metadata": {
        "id": "_qRc30945Z3p",
        "colab_type": "code",
        "colab": {
          "autoexec": {
            "startup": false,
            "wait_interval": 0
          }
        }
      },
      "cell_type": "code",
      "source": [
        "huber_regression = Regression(huber_loss)\n",
        "dataset = tf.data.Dataset.from_tensor_slices((X, Y))\n",
        "regress(huber_regression,\n",
        "        optimizer=tf.train.GradientDescentOptimizer(learning_rate=0.0001),\n",
        "        dataset=dataset)"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "metadata": {
        "id": "5icvQghlN8Fd",
        "colab_type": "text"
      },
      "cell_type": "markdown",
      "source": [
        "## Debugging and profiling"
      ]
    },
    {
      "metadata": {
        "id": "55qmgvjgQocz",
        "colab_type": "text"
      },
      "cell_type": "markdown",
      "source": [
        "### Enabling eager execution lets you debug your code on-the-fly; use `pdb` and print statements to your heart's content.\n",
        "\n",
        "Check out exercise 2 towards the bottom of this notebook for a hands-on look at how eager simplifies model debugging."
      ]
    },
    {
      "metadata": {
        "id": "DNHJpCyNVwA9",
        "colab_type": "code",
        "colab": {
          "autoexec": {
            "startup": false,
            "wait_interval": 0
          }
        }
      },
      "cell_type": "code",
      "source": [
        "import pdb\n",
        "\n",
        "def buggy_loss(y, y_hat):\n",
        "  pdb.set_trace()\n",
        "  huber_loss(y, y_hat)\n",
        "  \n",
        "print(\"Type 'exit' to stop the debugger, or 's' to step into `huber_loss` and \"\n",
        "      \"'n' to step through it.\")\n",
        "try:\n",
        "  buggy_loss(1.0, 2.0)\n",
        "except:\n",
        "  pass"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "metadata": {
        "id": "mvI3ljk-vJ_h",
        "colab_type": "text"
      },
      "cell_type": "markdown",
      "source": [
        "### Leverage the Python profiler to dig into the relative costs of training your model.\n",
        "\n",
        "If you run the below cell, you'll see that most of the time is spent computing gradients and binary operations, which is sensible considering our loss function."
      ]
    },
    {
      "metadata": {
        "id": "ZUlywNxYsapf",
        "colab_type": "code",
        "colab": {
          "autoexec": {
            "startup": false,
            "wait_interval": 0
          }
        }
      },
      "cell_type": "code",
      "source": [
        "import cProfile\n",
        "import pstats\n",
        "\n",
        "huber_regression = Regression(huber_loss)\n",
        "cProfile.run(\n",
        "    \"regress(model=huber_regression, \"\n",
        "    \"optimizer=tf.train.GradientDescentOptimizer(learning_rate=0.001), \"\n",
        "    \"dataset=dataset, log_every=None)\", \"prof\")\n",
        "pstats.Stats(\"prof\").strip_dirs().sort_stats(\"cumulative\").print_stats(10)\n",
        "print(\"Most of the time is spent during backpropagation and binary operations.\")"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "metadata": {
        "id": "5AeTwwPobkaJ",
        "colab_type": "text"
      },
      "cell_type": "markdown",
      "source": [
        "# 3. Neural networks\n",
        "\n",
        "While eager execution can certainly be used as a library for numerical computation, it shines as a library for deep learning: TensorFlow provides a suite of tools for deep learning research and development, most of which are compatible with eager execution. In this section, we put some of these tools to use to build _RNNColorbot_, an RNN that takes as input names of colors and predicts their corresponding RGB tuples. "
      ]
    },
    {
      "metadata": {
        "id": "6IcmEQ-jpTMO",
        "colab_type": "text"
      },
      "cell_type": "markdown",
      "source": [
        "## Constructing a data pipeline\n",
        "\n",
        "**[`tf.data`](https://www.tensorflow.org/api_guides/python/reading_data#_tf_data_API) is TensorFlow's canonical API for constructing input pipelines.** `tf.data` lets you easily construct multi-stage pipelines that supply data to your networks during training and inference. The following cells defines methods that download and format the data needed for RNNColorbot; the details aren't important (read them in the privacy of your own home if you so wish), but make sure to run the cells before proceeding."
      ]
    },
    {
      "metadata": {
        "id": "dcUC3Ma8bjgY",
        "colab_type": "code",
        "colab": {
          "autoexec": {
            "startup": false,
            "wait_interval": 0
          }
        },
        "cellView": "code"
      },
      "cell_type": "code",
      "source": [
        "import os\n",
        "import six\n",
        "from six.moves import urllib\n",
        "\n",
        "\n",
        "def parse(line):\n",
        "  \"\"\"Parse a line from the colors dataset.\"\"\"\n",
        "  # `items` is a list [color_name, r, g, b].\n",
        "  items = tf.string_split([line], \",\").values\n",
        "  rgb = tf.string_to_number(items[1:], out_type=tf.float32) / 255.\n",
        "  color_name = items[0]\n",
        "  chars = tf.one_hot(tf.decode_raw(color_name, tf.uint8), depth=256)\n",
        "  length = tf.cast(tf.shape(chars)[0], dtype=tf.int64)\n",
        "  return rgb, chars, length\n",
        "\n",
        "def load_dataset(data_dir, url, batch_size):\n",
        "  \"\"\"Loads the colors data at path into a PaddedDataset.\"\"\"\n",
        "  path = tf.keras.utils.get_file(os.path.basename(url), url, cache_dir=data_dir)\n",
        "  dataset = tf.data.TextLineDataset(path).skip(1).map(parse).shuffle(\n",
        "      buffer_size=10000).padded_batch(batch_size,\n",
        "                                      padded_shapes=([None], [None, None], []))\n",
        "  return dataset, path"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "metadata": {
        "id": "KBPJAQPUlh5M",
        "colab_type": "code",
        "colab": {
          "autoexec": {
            "startup": false,
            "wait_interval": 0
          }
        }
      },
      "cell_type": "code",
      "source": [
        "train_url = \"https://raw.githubusercontent.com/random-forests/tensorflow-workshop/master/extras/colorbot/data/train.csv\"\n",
        "test_url = \"https://raw.githubusercontent.com/random-forests/tensorflow-workshop/master/extras/colorbot/data/test.csv\"\n",
        "data_dir = \"/tmp/rnn/data\"\n",
        "\n",
        "train_data, train_path = load_dataset(data_dir, train_url, batch_size=64)\n",
        "eval_data, _ = load_dataset(data_dir, test_url, batch_size=64)"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "metadata": {
        "id": "w9ftJ4LUoVYo",
        "colab_type": "code",
        "colab": {
          "autoexec": {
            "startup": false,
            "wait_interval": 0
          }
        }
      },
      "cell_type": "code",
      "source": [
        "import pandas\n",
        "pandas.read_csv(train_path).head(10)"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "metadata": {
        "id": "ynzm5mfnlmS8",
        "colab_type": "code",
        "colab": {
          "autoexec": {
            "startup": false,
            "wait_interval": 0
          }
        }
      },
      "cell_type": "code",
      "source": [
        "colors, one_hot_chars, lengths = tfe.Iterator(train_data).next()\n",
        "colors[:10].numpy()"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "metadata": {
        "id": "S39jq-2QoA5e",
        "colab_type": "text"
      },
      "cell_type": "markdown",
      "source": [
        "## Defining and training a neural network"
      ]
    },
    {
      "metadata": {
        "id": "9fycJOqm8vkt",
        "colab_type": "text"
      },
      "cell_type": "markdown",
      "source": [
        "TensorFlow packages several APIs for creating neural networks in a modular fashion. **The canonical way to define neural networks in TensorFlow is to encapsulate your model in a class that inherits from `tf.keras.Model`**. You should think of `tf.keras.Model` as a container of **[object-oriented layers](https://www.tensorflow.org/api_docs/python/tf/layers)**, TensorFlow's building blocks for constructing neural networks (*e.g.*, `tf.layers.Dense`, `tf.layers.Conv2D`). Every `Layer` object that is set as an attribute of a `Model` is automatically tracked by the latter, letting you access `Layer`-contained variables by invoking `Model`'s `.variables()` method. Most important, **inheriting from `tf.keras.Model` makes it easy to checkpoint your model and to subsequently restore it** --- more on that later. \n",
        "\n",
        "The following cell exemplifies our high-level neural network APIs. Note that `RNNColorbot` encapsulates only the model definition and prediction generation logic. The loss, training, and evaluation functions exist outside the class definition: conceptually, the model doesn't need know how to train and benchmark itself."
      ]
    },
    {
      "metadata": {
        "id": "NlKcdvT9leQ2",
        "colab_type": "code",
        "colab": {
          "autoexec": {
            "startup": false,
            "wait_interval": 0
          }
        },
        "cellView": "code"
      },
      "cell_type": "code",
      "source": [
        "class RNNColorbot(tf.keras.Model):\n",
        "  \"\"\"Multi-layer RNN that predicts RGB tuples given color names.\n",
        "  \"\"\"\n",
        "\n",
        "  def __init__(self):\n",
        "    super(RNNColorbot, self).__init__()\n",
        "    self.keep_prob = 0.5\n",
        "    self.lower_cell = tf.contrib.rnn.LSTMBlockCell(256)\n",
        "    self.upper_cell = tf.contrib.rnn.LSTMBlockCell(128)\n",
        "    self.relu = tf.layers.Dense(3, activation=tf.nn.relu, name=\"relu\")\n",
        "\n",
        "  def call(self, inputs, training=False):\n",
        "    \"\"\"Generates RGB tuples from `inputs`, a tuple (`chars`, `sequence_length`).\n",
        "    \"\"\"\n",
        "    (chars, sequence_length) = inputs\n",
        "    chars = tf.transpose(chars, [1, 0, 2])  # make `chars` time-major\n",
        "    batch_size = int(chars.shape[1])\n",
        "    for cell in [self.lower_cell, self.upper_cell]:\n",
        "      outputs = []\n",
        "      state = cell.zero_state(batch_size, tf.float32)\n",
        "      for ch in chars:\n",
        "        output, state = cell(ch, state)\n",
        "        outputs.append(output)\n",
        "      chars = outputs\n",
        "      if training:\n",
        "        chars = tf.nn.dropout(chars, self.keep_prob)\n",
        "    batch_range = [i for i in range(batch_size)]\n",
        "    indices = tf.stack([sequence_length - 1, batch_range], axis=1)\n",
        "    hidden_states = tf.gather_nd(chars, indices)\n",
        "    return self.relu(hidden_states)\n",
        "\n",
        "\n",
        "def loss_fn(labels, predictions):\n",
        "  return tf.reduce_mean((predictions - labels) ** 2)\n",
        "\n",
        "def train_one_epoch(model, optimizer, train_data, log_every=10):\n",
        "  iterator = tfe.Iterator(train_data)\n",
        "  for batch,(labels, chars, sequence_length) in enumerate(iterator):\n",
        "    with tf.GradientTape() as tape:\n",
        "      predictions = model((chars, sequence_length), training=True)\n",
        "      loss = loss_fn(labels, predictions)\n",
        "    variables = model.variables\n",
        "    grad = tape.gradient(loss, variables)\n",
        "    optimizer.apply_gradients([(g, v) for g, v in zip(grad, variables)])\n",
        "    if log_every and batch % log_every == 0:\n",
        "      print(\"train/batch #%d\\tloss: %.6f\" % (batch, loss))\n",
        "    batch += 1\n",
        "           \n",
        "def test(model, eval_data):\n",
        "  total_loss = 0.0\n",
        "  iterator = eval_data.make_one_shot_iterator()\n",
        "  for labels, chars, sequence_length in tfe.Iterator(eval_data):\n",
        "    predictions = model((chars, sequence_length), training=False)\n",
        "    total_loss += loss_fn(labels, predictions)\n",
        "  print(\"eval/loss: %.6f\\n\" % total_loss)"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "metadata": {
        "id": "xG1FxnhD62N3",
        "colab_type": "text"
      },
      "cell_type": "markdown",
      "source": [
        "The next cell **trains** our `RNNColorbot`, **restoring and saving checkpoints** of the learned variables along the way. Thanks to checkpointing, every run of the below cell will resume training from wherever the previous run left off. For more on checkpointing, take a look at our [user guide](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/eager/python/g3doc/guide.md#checkpointing-trained-variables)."
      ]
    },
    {
      "metadata": {
        "id": "W7wLw3nZsqKQ",
        "colab_type": "code",
        "colab": {
          "autoexec": {
            "startup": false,
            "wait_interval": 0
          }
        }
      },
      "cell_type": "code",
      "source": [
        "model = RNNColorbot()\n",
        "optimizer = tf.train.AdamOptimizer(learning_rate=.01)\n",
        "\n",
        "# Create a `Checkpoint` for saving and restoring state; the keywords\n",
        "# supplied `Checkpoint`'s constructor are the names of the objects to be saved\n",
        "# and restored, and their corresponding values are the actual objects. Note\n",
        "# that we're saving `optimizer` in addition to `model`, since `AdamOptimizer`\n",
        "# maintains state.\n",
        "import tensorflow.contrib.eager as tfe\n",
        "checkpoint = tfe.Checkpoint(model=model, optimizer=optimizer)\n",
        "checkpoint_prefix = \"/tmp/rnn/ckpt\"\n",
        "# The next line loads the most recent checkpoint, if any.\n",
        "checkpoint.restore(tf.train.latest_checkpoint(\"/tmp/rnn\"))\n",
        "for epoch in range(4):\n",
        "  train_one_epoch(model, optimizer, train_data)\n",
        "  test(model, eval_data)\n",
        "  checkpoint.save(checkpoint_prefix)\n",
        "print(\"Colorbot is ready to generate colors!\")"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "metadata": {
        "id": "1HdJk37R1xz9",
        "colab_type": "text"
      },
      "cell_type": "markdown",
      "source": [
        "### Paint me a color, Colorbot!\n",
        "\n",
        "We can interact with RNNColorbot in a natural way; no need to thread NumPy arrays into placeholders through feed dicts.\n",
        "So go ahead and ask RNNColorbot to paint you some colors. If they're not to your liking, re-run the previous cell to resume training from where we left off, and then re-run the next one for updated results."
      ]
    },
    {
      "metadata": {
        "id": "LXAYjopasyWr",
        "colab_type": "code",
        "colab": {
          "autoexec": {
            "startup": false,
            "wait_interval": 0
          }
        }
      },
      "cell_type": "code",
      "source": [
        "tb = widgets.TabBar([\"RNN Colorbot\"])\n",
        "while True:\n",
        "  with tb.output_to(0):\n",
        "    try:\n",
        "      color_name = six.moves.input(\n",
        "          \"Give me a color name (or press 'enter' to exit): \")\n",
        "    except (EOFError, KeyboardInterrupt):\n",
        "      break\n",
        "  if not color_name:\n",
        "    break\n",
        "  _, chars, length = parse(color_name)\n",
        "  preds, = model((np.expand_dims(chars, 0), np.expand_dims(length, 0)),\n",
        "                 training=False)\n",
        "  clipped_preds = tuple(min(float(p), 1.0) for p in preds)\n",
        "  rgb = tuple(int(p * 255) for p in clipped_preds)\n",
        "  with tb.output_to(0):\n",
        "    tb.clear_tab()\n",
        "    print(\"Predicted RGB tuple:\", rgb)\n",
        "    plt.imshow([[clipped_preds]])\n",
        "    plt.title(color_name)\n",
        "    plt.show()"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "metadata": {
        "id": "aJopbdYiXXQM",
        "colab_type": "text"
      },
      "cell_type": "markdown",
      "source": [
        "# 4. Exercises"
      ]
    },
    {
      "metadata": {
        "id": "Nt2bZ3SNq0bl",
        "colab_type": "text"
      },
      "cell_type": "markdown",
      "source": [
        "### Exercise 1: Batching\n",
        "\n",
        "Executing operations eagerly incurs small overheads; these overheads become neglible when amortized over batched operations. In this exercise, we explore the relationship between batching and performance by revisiting our Huber regression example."
      ]
    },
    {
      "metadata": {
        "id": "U5NR8vOY-4Xx",
        "colab_type": "code",
        "colab": {
          "autoexec": {
            "startup": false,
            "wait_interval": 0
          }
        }
      },
      "cell_type": "code",
      "source": [
        "# Our original implementation of `huber_loss` is not compatible with non-scalar\n",
        "# data. Your task is to fix that. For your convenience, the original\n",
        "# implementation is reproduced below.\n",
        "#\n",
        "#   def huber_loss(y, y_hat, m=1.0):\n",
        "#     delta = tf.abs(y - y_hat)\n",
        "#     return delta ** 2 if delta <= m else m * (2 * delta - m)\n",
        "#\n",
        "def batched_huber_loss(y, y_hat, m=1.0):\n",
        "  # TODO: Uncomment out the below code and replace `...` with your solution.\n",
        "  # Hint: Tensors are immutable.\n",
        "  # Hint: `tf.where` might be useful.\n",
        "  delta = tf.abs(y - y_hat)\n",
        "  # ...\n",
        "  # ...\n",
        "  # return ...\n",
        "  \n",
        "regression = Regression(batched_huber_loss)\n",
        "\n",
        "num_epochs = 4\n",
        "batch_sizes = [1, 10, 20, 100, 200, 500, 1000]\n",
        "times = []\n",
        "\n",
        "X, Y = gen_regression_data(num_examples=1000)\n",
        "dataset = tf.data.Dataset.from_tensor_slices((X, Y))\n",
        "optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.0001)\n",
        "for size in batch_sizes:\n",
        "  batched_dataset = dataset.batch(size)\n",
        "  start = time.time()\n",
        "  regress(model=regression, optimizer=optimizer, dataset=batched_dataset,\n",
        "          epochs=num_epochs, log_every=None)\n",
        "  end = time.time()\n",
        "  times.append((end - start) / num_epochs)\n",
        "  regression.w.assign(0.0)\n",
        "  regression.b.assign(0.0)\n",
        "  \n",
        "plt.figure()\n",
        "plt.plot(batch_sizes, times, \"bo\")\n",
        "plt.xlabel(\"batch size\")\n",
        "plt.ylabel(\"time (seconds)\")\n",
        "plt.semilogx()\n",
        "plt.semilogy()\n",
        "plt.title(\"Time per Epoch vs. Batch Size\")\n",
        "plt.show()"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "metadata": {
        "id": "-aH9GM4G-c56",
        "colab_type": "text"
      },
      "cell_type": "markdown",
      "source": [
        "#### Solution"
      ]
    },
    {
      "metadata": {
        "id": "MqqhJplCBxNC",
        "colab_type": "code",
        "colab": {
          "autoexec": {
            "startup": false,
            "wait_interval": 0
          }
        }
      },
      "cell_type": "code",
      "source": [
        "def batched_huber_loss(y, y_hat, m=1.0):\n",
        "  delta = tf.abs(y - y_hat)\n",
        "  quadratic = delta ** 2\n",
        "  linear =  m * (2 * delta - m)\n",
        "  return tf.reduce_mean(tf.where(delta <= m, quadratic, linear))\n",
        "  \n",
        "regression = Regression(batched_huber_loss)\n",
        "\n",
        "num_epochs = 4\n",
        "batch_sizes = [2, 10, 20, 100, 200, 500, 1000]\n",
        "times = []\n",
        "\n",
        "X, Y = gen_regression_data(num_examples=1000)\n",
        "dataset = tf.data.Dataset.from_tensor_slices((X, Y))\n",
        "optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.0001)\n",
        "for size in batch_sizes:\n",
        "  batched_dataset = dataset.batch(size)\n",
        "  start = time.time()\n",
        "  regress(model=regression, optimizer=optimizer, dataset=batched_dataset,\n",
        "          epochs=num_epochs, log_every=None)\n",
        "  end = time.time()\n",
        "  times.append((end - start) / num_epochs)\n",
        "  regression.w.assign(0.0)\n",
        "  regression.b.assign(0.0)\n",
        "  \n",
        "plt.figure()\n",
        "plt.plot(batch_sizes, times, \"bo\")\n",
        "plt.xlabel(\"batch size\")\n",
        "plt.ylabel(\"time (seconds)\")\n",
        "plt.semilogx()\n",
        "plt.semilogy()\n",
        "plt.title(\"Time per Epoch vs. Batch Size\")\n",
        "plt.show()"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "metadata": {
        "id": "YbL8CZNp-pvH",
        "colab_type": "text"
      },
      "cell_type": "markdown",
      "source": [
        "### Exercise 2: Model Debugging\n",
        "\n",
        "We've heard you loud and clear: TensorFlow programs that construct and execute graphs are difficult to debug. By design, enabling eager execution vastly simplifies the process of debugging TensorFlow programs. Once eager execution is enabled, you can step through your models using `pdb` and bisect them with `print` statements. The best way to understand the extent to which eager execution simplifies debugging is to debug a model yourself. `BuggyModel` below has two bugs lurking in it. Execute the following cell, read the error message, and go hunt some bugs!\n",
        "\n",
        "*Hint: As is often the case with TensorFlow programs, both bugs are related to the shapes of Tensors.*\n",
        "\n",
        "*Hint: You might find `tf.layers.flatten` useful.*"
      ]
    },
    {
      "metadata": {
        "id": "Aa9HIamW-m3t",
        "colab_type": "code",
        "colab": {
          "autoexec": {
            "startup": false,
            "wait_interval": 0
          }
        }
      },
      "cell_type": "code",
      "source": [
        "class BuggyModel(tf.keras.Model):\n",
        "   def __init__(self):\n",
        "    super(BuggyModel, self).__init__()\n",
        "    self._input_shape = [-1, 28, 28, 1]\n",
        "    self.conv = tf.layers.Conv2D(filters=32, kernel_size=5, padding=\"same\",\n",
        "                                 data_format=\"channels_last\")\n",
        "    self.fc = tf.layers.Dense(10)\n",
        "    self.max_pool2d = tf.layers.MaxPooling2D(\n",
        "        (2, 2), (2, 2), padding=\"same\", data_format=\"channels_last\")\n",
        "    \n",
        "  def call(self, inputs):\n",
        "    y = inputs\n",
        "    y = self.conv(y)\n",
        "    y = self.max_pool2d(y)\n",
        "    return self.fc(y)\n",
        "  \n",
        "buggy_model = BuggyModel()\n",
        "inputs = tf.random_normal(shape=(100, 28, 28))\n",
        "outputs = buggy_model(inputs)\n",
        "assert outputs.shape == (100, 10), \"invalid output shape: %s\" % outputs.shape"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "metadata": {
        "id": "ja8aFOnYsKez",
        "colab_type": "text"
      },
      "cell_type": "markdown",
      "source": [
        "#### Solution"
      ]
    },
    {
      "metadata": {
        "id": "J7z8JbrRltzV",
        "colab_type": "code",
        "colab": {
          "autoexec": {
            "startup": false,
            "wait_interval": 0
          }
        },
        "cellView": "code"
      },
      "cell_type": "code",
      "source": [
        "class BuggyModel(tf.keras.Model):\n",
        "  def __init__(self):\n",
        "    super(BuggyModel, self).__init__()\n",
        "    self._input_shape = [-1, 28, 28, 1]\n",
        "    self.conv = tf.layers.Conv2D(filters=32, kernel_size=5, padding=\"same\",\n",
        "                                 data_format=\"channels_last\")\n",
        "    self.fc = tf.layers.Dense(10)\n",
        "    self.max_pool2d = tf.layers.MaxPooling2D(\n",
        "        (2, 2), (2, 2), padding=\"same\", data_format=\"channels_last\")\n",
        "    \n",
        "  def call(self, inputs):\n",
        "    y = tf.reshape(inputs, self._input_shape)\n",
        "    y = self.conv(y)\n",
        "    y = self.max_pool2d(y)\n",
        "    y = tf.layers.flatten(y)\n",
        "    return self.fc(y)\n",
        "  \n",
        "buggy_model = BuggyModel()\n",
        "inputs = tf.random_normal(shape=(100, 28, 28))\n",
        "outputs = buggy_model(inputs)\n",
        "assert outputs.shape == (100, 10), \"invalid output shape: %s\" % outputs.shape"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "metadata": {
        "id": "G-Ubr-Gfturc",
        "colab_type": "text"
      },
      "cell_type": "markdown",
      "source": [
        "# 5. Further reading\n",
        "\n",
        "If you'd like to learn more about eager execution, consider reading ...\n",
        "\n",
        "\n",
        "\n",
        "*   our [user guide](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/eager/python/g3doc/guide.md);\n",
        "*   our [collection of example models](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/eager/python/examples), which includes a convolutional model for [MNIST](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/eager/python/examples/mnist) classification, a [GAN](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/eager/python/examples/gan), a [recursive neural network](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/eager/python/examples/spinn), and more;\n",
        "*  [this advanced notebook](https://colab.research.google.com/github/tensorflow/tensorflow/blob/master/tensorflow/contrib/autograph/examples/notebooks/dev_summit_2018_demo.ipynb), which explains how to build and execute graphs while eager execution is enabled and how to call into eager execution while constructing a graph, and which also introduces Autograph, a source-code translation tool that automatically generates graph-construction code from dynamic eager code.\n",
        "\n",
        "\n"
      ]
    }
  ]
}