Commit 37d86a95 authored by Daniil Pakhomov's avatar Daniil Pakhomov Committed by Sergio Guadarrama

Slim walkthrough notebook update to show the difference in inference between models with 1000 and 1001 classes. (#591)

* An example of VGG-19 inference was added to the walkthrough notebook. The example shows how to use networks that were trained with 1001 versus 1000 classes.

* A small comment was added.

* Changed VGG-19 to VGG-16.
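
The core of the difference, as a minimal sketch (this assumes the `datasets/imagenet` helper from this repo; the predicted indices below are made-up examples):

```python
from datasets import imagenet

# The helper returns 1001 readable names: index 0 is the "background"
# class, indices 1..1000 are the actual ImageNet labels.
names = imagenet.create_readable_names_for_imagenet_labels()

# Inception checkpoints have 1001 outputs, so the argmax of their
# logits indexes the name table directly.
inception_pred = 208  # hypothetical argmax of a 1001-way softmax
print(names[inception_pred])

# VGG/ResNet checkpoints have 1000 outputs (no background class), so
# the argmax must be shifted by one before the lookup.
vgg_pred = 207  # hypothetical argmax of a 1000-way softmax
print(names[vgg_pred + 1])
```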
parent cf6b9cee
@@ -753,7 +753,9 @@
 "However, this means they must be trained on big datasets. Since this process is slow, we provide various pre-trained models - see the list [here](https://github.com/tensorflow/models/tree/master/slim#pre-trained-models).\n",
 "\n",
 "\n",
-"You can either use these models as-is, or you can perform \"surgery\" on them, to modify them for some other task. For example, it is common to \"chop off\" the final pre-softmax layer, and replace it with a new set of weights corresponding to some new set of labels. You can then quickly fine tune the new model on a small new dataset. We illustrate this below, using inception-v1 as the base model. While models like Inception V3 are more powerful, Inception V1 is used for speed purposes.\n"
+"You can either use these models as-is, or you can perform \"surgery\" on them, to modify them for some other task. For example, it is common to \"chop off\" the final pre-softmax layer, and replace it with a new set of weights corresponding to some new set of labels. You can then quickly fine tune the new model on a small new dataset. We illustrate this below, using inception-v1 as the base model. While models like Inception V3 are more powerful, Inception V1 is used for speed purposes.\n",
+"\n",
+"Take into account that the VGG and ResNet final layers have only 1000 outputs rather than 1001. The ImageNet dataset provided includes an empty background class, which can be used to fine-tune the model for other tasks. The VGG and ResNet models provided here don't use that class. We provide two examples of using pretrained models, Inception V1 and VGG-16, to highlight this difference.\n"
 ]
 },
 {
@@ -789,7 +791,7 @@
 "metadata": {},
 "source": [
 "\n",
-"### Apply Pre-trained model to Images.\n",
+"### Apply Pre-trained Inception V1 model to Images.\n",
 "\n",
 "We have to convert each image to the size expected by the model checkpoint.\n",
 "There is no easy way to determine this size from the checkpoint itself.\n",
@@ -815,7 +817,6 @@
 "\n",
 "slim = tf.contrib.slim\n",
 "\n",
-"batch_size = 3\n",
 "image_size = inception.inception_v1.default_image_size\n",
 "\n",
 "with tf.Graph().as_default():\n",
@@ -851,6 +852,101 @@
 " print('Probability %0.2f%% => [%s]' % (probabilities[index], names[index]))"
 ]
 },
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"### Download the VGG-16 checkpoint"
+]
+},
+{
+"cell_type": "code",
+"execution_count": null,
+"metadata": {
+"collapsed": true
+},
+"outputs": [],
+"source": [
+"from datasets import dataset_utils\n",
+"import tensorflow as tf\n",
+"\n",
+"url = \"http://download.tensorflow.org/models/vgg_16_2016_08_28.tar.gz\"\n",
+"checkpoints_dir = '/tmp/checkpoints'\n",
+"\n",
+"if not tf.gfile.Exists(checkpoints_dir):\n",
+"    tf.gfile.MakeDirs(checkpoints_dir)\n",
+"\n",
+"dataset_utils.download_and_uncompress_tarball(url, checkpoints_dir)"
+]
+},
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"\n",
+"### Apply Pre-trained VGG-16 model to Images.\n",
+"\n",
+"We have to convert each image to the size expected by the model checkpoint.\n",
+"There is no easy way to determine this size from the checkpoint itself.\n",
+"So we use a preprocessor to enforce this. Note that VGG-16 has 1000 output classes rather than 1001, so the predicted index must be shifted by one before looking up the label name."
+]
+},
+{
+"cell_type": "code",
+"execution_count": null,
+"metadata": {
+"collapsed": true
+},
+"outputs": [],
+"source": [
+"import numpy as np\n",
+"import os\n",
+"import tensorflow as tf\n",
+"import urllib2\n",
+"\n",
+"from datasets import imagenet\n",
+"from nets import vgg\n",
+"from preprocessing import vgg_preprocessing\n",
+"\n",
+"slim = tf.contrib.slim\n",
+"\n",
+"image_size = vgg.vgg_16.default_image_size\n",
+"\n",
+"with tf.Graph().as_default():\n",
+"    url = 'https://upload.wikimedia.org/wikipedia/commons/d/d9/First_Student_IC_school_bus_202076.jpg'\n",
+"    image_string = urllib2.urlopen(url).read()\n",
+"    image = tf.image.decode_jpeg(image_string, channels=3)\n",
+"    processed_image = vgg_preprocessing.preprocess_image(image, image_size, image_size, is_training=False)\n",
+"    processed_images = tf.expand_dims(processed_image, 0)\n",
+"    \n",
+"    # Create the model, using the default arg scope to configure parameters such as weight decay (VGG uses no batch norm).\n",
+"    with slim.arg_scope(vgg.vgg_arg_scope()):\n",
+"        # 1000 classes instead of 1001.\n",
+"        logits, _ = vgg.vgg_16(processed_images, num_classes=1000, is_training=False)\n",
+"    probabilities = tf.nn.softmax(logits)\n",
+"    \n",
+"    init_fn = slim.assign_from_checkpoint_fn(\n",
+"        os.path.join(checkpoints_dir, 'vgg_16.ckpt'),\n",
+"        slim.get_model_variables('vgg_16'))\n",
+"    \n",
+"    with tf.Session() as sess:\n",
+"        init_fn(sess)\n",
+"        np_image, probabilities = sess.run([image, probabilities])\n",
+"        probabilities = probabilities[0, 0:]\n",
+"        sorted_inds = [i[0] for i in sorted(enumerate(-probabilities), key=lambda x:x[1])]\n",
+"    \n",
+"    plt.figure()\n",
+"    plt.imshow(np_image.astype(np.uint8))\n",
+"    plt.axis('off')\n",
+"    plt.show()\n",
+"    \n",
+"    names = imagenet.create_readable_names_for_imagenet_labels()\n",
+"    for i in range(5):\n",
+"        index = sorted_inds[i]\n",
+"        # Shift the index by one to skip the background class: the VGG checkpoint was trained without it.\n",
+"        print('Probability %0.2f%% => [%s]' % (probabilities[index], names[index+1]))"
+]
+},
 {
 "cell_type": "markdown",
 "metadata": {},
@@ -1015,7 +1111,7 @@
 "name": "python",
 "nbconvert_exporter": "python",
 "pygments_lexer": "ipython2",
-"version": "2.7.6"
+"version": "2.7.11"
 }
 },
 "nbformat": 4,
...
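
A related note: when one of these 1000-class checkpoints is fine-tuned on a new label set (the "surgery" described in the notebook), the final layer is typically excluded when restoring from the checkpoint, because its shape no longer matches. A minimal sketch using the same TF-Slim APIs as the notebook; the 5-class target task is hypothetical:

```python
import os
import tensorflow as tf
from nets import vgg

slim = tf.contrib.slim
checkpoints_dir = '/tmp/checkpoints'

with tf.Graph().as_default():
    images = tf.placeholder(tf.float32, [None, 224, 224, 3])
    with slim.arg_scope(vgg.vgg_arg_scope()):
        # New head with 5 outputs for the (hypothetical) target task.
        logits, _ = vgg.vgg_16(images, num_classes=5, is_training=True)

    # Restore all weights except the final layer ('vgg_16/fc8'), whose
    # shape no longer matches the 1000-class checkpoint.
    variables_to_restore = slim.get_variables_to_restore(exclude=['vgg_16/fc8'])
    init_fn = slim.assign_from_checkpoint_fn(
        os.path.join(checkpoints_dir, 'vgg_16.ckpt'), variables_to_restore)
```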