"vscode:/vscode.git/clone" did not exist on "498cd31b2f7000e739f1da59676f1ddde936a50e"
Commit bfee3a1f authored by Mark Daoust's avatar Mark Daoust

Layers

parent 71b64ccb
......@@ -10,8 +10,7 @@
"collapsed_sections": [
"Jxv6goXm7oGF"
],
"toc_visible": true,
"include_colab_link": true
"toc_visible": true
},
"kernelspec": {
"name": "python3",
......@@ -19,16 +18,6 @@
}
},
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "view-in-github",
"colab_type": "text"
},
"source": [
"[View in Colaboratory](https://colab.research.google.com/github/MarkDaoust/models/blob/autograph/samples/core/guide/autograph.ipynb)"
]
},
{
"metadata": {
"id": "Jxv6goXm7oGF",
......@@ -124,7 +113,7 @@
},
"cell_type": "code",
"source": [
"! pip install tf-nightly"
"! pip install -U tf-nightly"
],
"execution_count": 0,
"outputs": []
......@@ -150,8 +139,11 @@
"from __future__ import division, print_function, absolute_import\n",
"\n",
"import tensorflow as tf\n",
"import tensorflow.keras.layers as layers\n",
"from tensorflow.contrib import autograph\n",
"\n",
"\n",
"import numpy as np\n",
"import matplotlib.pyplot as plt"
],
"execution_count": 0,
......@@ -469,7 +461,7 @@
" \n",
"with tf.Graph().as_default():\n",
" with tf.Session():\n",
" count(tf.constant(0)).eval()"
" count(tf.constant(5)).eval()"
],
"execution_count": 0,
"outputs": []
......@@ -506,7 +498,6 @@
" # (this is just like np.stack)\n",
" return autograph.stack(z) \n",
"\n",
"#tf_f = autograph.to_graph(f)\n",
"\n",
"with tf.Graph().as_default(): \n",
" with tf.Session():\n",
......@@ -581,6 +572,42 @@
"execution_count": 0,
"outputs": []
},
{
"metadata": {
"id": "3N1mz7sNY87N",
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"### For loop"
]
},
{
"metadata": {
"id": "CFk2fszrY8af",
"colab_type": "code",
"colab": {}
},
"cell_type": "code",
"source": [
"@autograph.convert()\n",
"def fizzbuzz_each(nums):\n",
"\n",
" result = []\n",
" autograph.set_element_type(result, tf.string)\n",
"\n",
" for num in nums: \n",
" result.append(fizzbuzz(num))\n",
" \n",
" return autograph.stack(result)\n",
" \n",
"with tf.Graph().as_default(): \n",
" with tf.Session() as sess:\n",
" print(sess.run(fizzbuzz_each(tf.constant(np.arange(10)))))"
],
"execution_count": 0,
"outputs": []
},
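The converted loop above maps `fizzbuzz` over a tensor of numbers. As a framework-free sanity check, here is a pure-Python sketch of the same per-element mapping; the `fizzbuzz` helper defined earlier in the guide (not shown in this diff) is assumed to implement the classic mapping, so the one below is illustrative only:

```python
def fizzbuzz(num):
    # classic FizzBuzz mapping; assumed to match the guide's earlier helper
    if num % 15 == 0:
        return 'FizzBuzz'
    elif num % 3 == 0:
        return 'Fizz'
    elif num % 5 == 0:
        return 'Buzz'
    return str(num)

def fizzbuzz_each_py(nums):
    # plain-Python analogue of the converted loop: map fizzbuzz over each element
    return [fizzbuzz(n) for n in nums]

result = fizzbuzz_each_py(range(1, 11))
```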
{
"metadata": {
"id": "FXB0Zbwl13PY",
......@@ -619,6 +646,313 @@
"execution_count": 0,
"outputs": []
},
{
"metadata": {
"id": "XY4UspHmZNdL",
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"## Interoperation with `tf.Keras`\n",
"\n",
"Now that you've seen the basics, let's build some real model components with autograph.\n",
"\n",
"It's relatively simple to integrate `autograph` with `tf.keras`. But remember that batchng is essential for performance. So the best candidate code for conversion to autograph is code where the control flow is decided at the _batch_ level. If decisions are made at the individual _example_ level you will still need to index and batch your examples to maintain performance while appling the control flow logic. \n",
"\n",
"\n",
"### Stateless functions\n",
"\n",
"For stateless functions like `collatz`, below, the easiest way to include them in a keras model is to wrap them up as a layer uisng `tf.keras.layers.Lambda`."
]
},
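The batch-level point above can be sketched without TensorFlow at all. This is a hypothetical NumPy illustration (not part of the notebook) contrasting a per-example Python branch with a single batch-level select:

```python
import numpy as np

# Per-example control flow: the branch runs once per element.
def relu_per_example(xs):
    out = []
    for x in xs:
        out.append(x if x > 0 else 0.0)  # decision made per example
    return np.array(out)

# Batch-level equivalent: one vectorized select over the whole batch.
def relu_batched(xs):
    return np.where(xs > 0, xs, 0.0)

batch = np.array([-1.0, 2.0, -3.0, 4.0])
```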
{
"metadata": {
"id": "ChZh3q-zcF6C",
"colab_type": "code",
"colab": {}
},
"cell_type": "code",
"source": [
"import numpy as np\n",
"\n",
"@autograph.convert()\n",
"def collatz(x):\n",
" x=tf.reshape(x,())\n",
" assert x>0\n",
" n = tf.convert_to_tensor(0) \n",
" while not tf.equal(x,1):\n",
" n+=1\n",
" if tf.equal(x%2, 0):\n",
" x = x//2\n",
" else:\n",
" x = 3*x+1\n",
" \n",
" return n\n",
"\n",
"with tf.Graph().as_default():\n",
" model = tf.keras.Sequential([\n",
" tf.keras.layers.Lambda(collatz, input_shape=(1,), output_shape=(), )\n",
" ])\n",
" \n",
"result = model.predict(np.array([6171])) #261\n",
"result"
],
"execution_count": 0,
"outputs": []
},
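The `261` expected in the comment above can be sanity-checked with a pure-Python version of the same loop, independent of TensorFlow (a sketch of the identical arithmetic, not the converted graph):

```python
def collatz_py(x):
    # count Collatz steps until x reaches 1: halve if even, else 3x + 1
    n = 0
    while x != 1:
        n += 1
        x = x // 2 if x % 2 == 0 else 3 * x + 1
    return n

steps = collatz_py(6171)  # the notebook's comment expects 261
```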
{
"metadata": {
"id": "k9LEoa3ud9hA",
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"### Custom Layers and Models\n",
"\n",
"<!--TODO(markdaoust) link to full examples or these referenced models.-->\n",
"\n",
"The easiest way is to `@autograph.convert()` the `call` method. See the [keras guide](https://tensorflow.org/guide/keras#build_advanced_models) for details on how to build on these classes. \n",
"\n",
"Here is a simple example of the [stocastic network depth](https://arxiv.org/abs/1603.09382) technique :"
]
},
{
"metadata": {
"id": "DJi_RJkeeOju",
"colab_type": "code",
"colab": {}
},
"cell_type": "code",
"source": [
"# `K` is used to check if we're in train or test mode.\n",
"import tensorflow.keras.backend as K\n",
"\n",
"class StocasticNetworkDepth(tf.keras.Sequential):\n",
" def __init__(self, pfirst=1.0, plast=0.5, *args,**kwargs):\n",
" self.pfirst = pfirst\n",
" self.plast = plast\n",
" super().__init__(*args,**kwargs)\n",
" \n",
" def build(self,input_shape):\n",
" super().build(input_shape.as_list())\n",
" self.depth = len(self.layers)\n",
" self.plims = np.linspace(self.pfirst, self.plast, self.depth+1)[:-1]\n",
" \n",
" @autograph.convert()\n",
" def call(self, inputs):\n",
" training = tf.cast(K.learning_phase(), dtype=bool) \n",
" if not training: \n",
" count = self.depth\n",
" return super(StocasticNetworkDepth, self).call(inputs), count\n",
" \n",
" p = tf.random_uniform((self.depth,))\n",
" \n",
" keeps = p<=self.plims\n",
" x = inputs\n",
" \n",
" count = tf.reduce_sum(tf.cast(keeps, tf.int32))\n",
" for i in range(self.depth):\n",
" if keeps[i]:\n",
" x = self.layers[i](x)\n",
" \n",
" # return both the final-layer output and the number of layers executed.\n",
" return x, count"
],
"execution_count": 0,
"outputs": []
},
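The keep/skip sampling inside `call` can be mimicked in plain NumPy to see how `plims` biases earlier layers toward being kept (a sketch mirroring the construction in `build`; the values and RNG seed are illustrative):

```python
import numpy as np

depth, pfirst, plast = 20, 1.0, 0.5
# Same construction as `build`: per-layer keep thresholds decreasing
# linearly from pfirst to plast.
plims = np.linspace(pfirst, plast, depth + 1)[:-1]

rng = np.random.default_rng(0)
p = rng.uniform(size=depth)   # one draw per layer, as in `call`
keeps = p <= plims            # layer i runs iff p[i] <= plims[i]
count = int(keeps.sum())      # number of layers executed this pass
```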
{
"metadata": {
"id": "NIEzuNL6vMVl",
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"Let's try it on mnist-shaped data:"
]
},
{
"metadata": {
"id": "FiqyFySkWbeN",
"colab_type": "code",
"colab": {}
},
"cell_type": "code",
"source": [
"train_batch = np.random.randn(64, 28,28,1).astype(np.float32)"
],
"execution_count": 0,
"outputs": []
},
{
"metadata": {
"id": "Vz1JTpLOvT4u",
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"Build a simple stack of `conv` layers, in the stocastic depth model:"
]
},
{
"metadata": {
"id": "XwwtlQAjvUph",
"colab_type": "code",
"colab": {}
},
"cell_type": "code",
"source": [
"with tf.Graph().as_default() as g:\n",
" model = StocasticNetworkDepth(\n",
" pfirst=1.0, plast=0.5)\n",
"\n",
" for n in range(20):\n",
" model.add(\n",
" layers.Conv2D(filters=16, activation=tf.nn.relu,\n",
" kernel_size=(3,3), padding='same'))\n",
"\n",
" model.build(tf.TensorShape((None, None, None,1)))\n",
" \n",
" init = tf.global_variables_initializer()"
],
"execution_count": 0,
"outputs": []
},
{
"metadata": {
"id": "uM3g_v7mvrkg",
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"Now test it to ensure it behaves as expected in train and test modes:"
]
},
{
"metadata": {
"id": "7tdmuh5Zvm3D",
"colab_type": "code",
"colab": {}
},
"cell_type": "code",
"source": [
"# Use an explicit session here so we can set the train/test switch, and\n",
"# inspect the layer count returned by `call`\n",
"with tf.Session(graph=g) as sess:\n",
" init.run()\n",
" \n",
" for phase, name in enumerate(['test','train']):\n",
" K.set_learning_phase(phase)\n",
" result, count = model(tf.convert_to_tensor(train_batch, dtype=tf.float32))\n",
"\n",
" result1, count1 = sess.run((result, count))\n",
" result2, count2 = sess.run((result, count))\n",
"\n",
" delta = (result1 - result2)\n",
" print(name, \"sum abs delta: \", abs(delta).mean())\n",
" print(\" layers 1st call: \", count1)\n",
" print(\" layers 2nd call: \", count2)\n",
" print()"
],
"execution_count": 0,
"outputs": []
},
{
"metadata": {
"id": "cpUD21HQWcOq",
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"### RNN Cells\n",
"\n",
"The [standard approach](https://www.tensorflow.org/api_docs/python/tf/keras/layers/RNN) to custom RNN cells has the same issues that are solved by autograph.\n",
"\n",
"Implementing RNN cells with `autograph` is not much different from implementing them [under eager execution](https://colab.sandbox.google.com/github/tensorflow/tensorflow/blob/master/tensorflow/contrib/eager/python/examples/nmt_with_attention/nmt_with_attention.ipynb).\n",
"\n",
"To implement the prediction step in a keras model you could say:\n",
"\n"
]
},
{
"metadata": {
"id": "798S1r-sJGfR",
"colab_type": "code",
"colab": {}
},
"cell_type": "code",
"source": [
"class BahdanauAttention(tf.keras.Model):\n",
" def __init__(self, units):\n",
" super(BahdanauAttention, self).__init__()\n",
" self.W1 = tf.keras.layers.Dense(units)\n",
" self.W2 = tf.keras.layers.Dense(units)\n",
" self.V = tf.keras.layers.Dense(1)\n",
" \n",
" def call(self, features, hidden):\n",
" hidden_with_time_axis = tf.expand_dims(hidden, 1)\n",
" score = tf.nn.tanh(self.W1(features) + self.W2(hidden_with_time_axis))\n",
" attention_weights = tf.nn.softmax(self.V(score), axis=1)\n",
" context_vector = attention_weights * features\n",
" context_vector = tf.reduce_sum(context_vector, axis=1)\n",
" return context_vector, attention_weights"
],
"execution_count": 0,
"outputs": []
},
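The attention arithmetic above can be checked shape-by-shape in NumPy. This is a hypothetical stand-in for the `Dense` layers (plain weight matrices replace `W1`, `W2`, `V`; the dimensions are assumptions for illustration):

```python
import numpy as np

def softmax(x, axis):
    # numerically stable softmax along the given axis
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def bahdanau_attention(features, hidden, W1, W2, v):
    # features: (batch, max_length, n); hidden: (batch, h)
    hidden_with_time_axis = hidden[:, None, :]            # (batch, 1, h)
    score = np.tanh(features @ W1 + hidden_with_time_axis @ W2)
    attention_weights = softmax(score @ v, axis=1)        # (batch, T, 1)
    context_vector = (attention_weights * features).sum(axis=1)
    return context_vector, attention_weights

rng = np.random.default_rng(0)
features = rng.normal(size=(2, 5, 4))
hidden = rng.normal(size=(2, 6))
W1 = rng.normal(size=(4, 3))
W2 = rng.normal(size=(6, 3))
v = rng.normal(size=(3, 1))
context, weights = bahdanau_attention(features, hidden, W1, W2, v)
```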
{
"metadata": {
"id": "qwH-QnmlGV6c",
"colab_type": "code",
"colab": {}
},
"cell_type": "code",
"source": [
"class Decoder(tf.keras.Model):\n",
" def __init__(self, vocab_size, embedding_dim, dec_units):\n",
" super(Decoder, self).__init__()\n",
" self.dec_units = dec_units\n",
" self.embedding = layers.Embedding(vocab_size, embedding_dim)\n",
" self.gru = layers.GRU(self.dec_units)\n",
" self.fc = tf.keras.layers.Dense(vocab_size)\n",
" self.attention = BahdanauAttention(self.dec_units)\n",
" \n",
" def call(self, enc_output):\n",
" results = tf.keras\n",
" hidden_with_time_axis = tf.expand_dims(hidden, 1)\n",
" score = tf.nn.tanh(self.W1(enc_output) + self.W2(hidden_with_time_axis))\n",
" \n",
" # attention_weights shape == (batch_size, max_length, 1)\n",
" # we get 1 at the last axis because we are applying score to self.V\n",
" attention_weights = tf.nn.softmax(self.V(score), axis=1)\n",
" \n",
" # context_vector shape after sum == (batch_size, hidden_size)\n",
" context_vector = attention_weights * enc_output\n",
" context_vector = tf.reduce_sum(context_vector, axis=1)\n",
" \n",
" # x shape after passing through embedding == (batch_size, 1, embedding_dim)\n",
" x = self.embedding(x)\n",
" \n",
" # x shape after concatenation == (batch_size, 1, embedding_dim + hidden_size)\n",
" x = tf.concat([tf.expand_dims(context_vector, 1), x], axis=-1)\n",
" \n",
" # passing the concatenated vector to the GRU\n",
" output, state = self.gru(x)\n",
" \n",
" # output shape == (batch_size * max_length, hidden_size)\n",
" output = tf.reshape(output, (-1, output.shape[2]))\n",
" \n",
" # output shape == (batch_size * max_length, vocab)\n",
" x = self.fc(output)\n",
" \n",
" return x, state, attention_weights\n",
" \n",
" def initialize_hidden_state(self):\n",
" return tf.zeros((self.batch_sz, self.dec_units))"
],
"execution_count": 0,
"outputs": []
},
{
"metadata": {
"id": "4LfnJjm0Bm0B",
......@@ -630,9 +964,7 @@
"\n",
"Since writing control flow in AutoGraph is easy, running a training loop in a TensorFlow graph should also be easy. \n",
"\n",
"<!--TODO(markdaoust) link to examples showing autograph **in** keras models when ready-->\n",
"\n",
"Important: While this example wraps a `tf.keras.Model` using AutoGraph, `tf.contrib.autograph` is compatible with `tf.keras` and can be used in [Keras custom layers and models](https://tensorflow.org/guide/keras#build_advanced_models). The easiest way is to `@autograph.convert()` the `call` method.\n",
"Important: While this example wraps a `tf.keras.Model` using AutoGraph, `tf.contrib.autograph` is compatible with `tf.keras` and can be used in [Keras custom layers and models](https://tensorflow.org/guide/keras#build_advanced_models). \n",
"\n",
"This example shows how to train a simple Keras model on MNIST with the entire training process—loading batches, calculating gradients, updating parameters, calculating validation accuracy, and repeating until convergence—is performed in-graph."
]
......
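The in-graph loop described above follows the standard load-batch/compute-gradient/update cycle. As a framework-free sketch of that cycle, here is gradient descent on a toy linear model (all names and data below are illustrative, not from the notebook):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(256, 3))
true_w = np.array([1.5, -2.0, 0.5])
y = X @ true_w + 0.01 * rng.normal(size=256)

w = np.zeros(3)
lr = 0.1
for step in range(200):
    # gradient of the mean squared error with respect to w
    grad = 2.0 * X.T @ (X @ w - y) / len(y)
    w -= lr * grad  # parameter update
```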