Commit c35271ec authored by Raymond Yuan's avatar Raymond Yuan

minor updates

parent c4cbe63b
{
"nbformat": 4,
"nbformat_minor": 0,
"metadata": {
"colab": {
"name": "Image Segmentation",
"version": "0.3.2",
"provenance": [],
"private_outputs": true,
"collapsed_sections": []
},
"kernelspec": {
"name": "python3",
"display_name": "Python 3"
},
"accelerator": "GPU"
},
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "ULKtInm7dAMK",
"colab_type": "text"
"colab_type": "text",
"id": "ULKtInm7dAMK"
},
"cell_type": "markdown",
"source": [
"# Image Segmentation with `tf.keras`"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "7Plun_k1dAML",
"colab_type": "text"
"colab_type": "text",
"id": "7Plun_k1dAML"
},
"cell_type": "markdown",
"source": [
"<table class=\"tfo-notebook-buttons\" align=\"left\"><td>\n",
"<a target=\"_blank\" href=\"http://colab.research.google.com/github/tensorflow/models/blob/segmentation_blogpost/samples/outreach/blogs/segmentation_blogpost/image_segmentation.ipynb\">\n",
@@ -41,11 +25,11 @@
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "cl79rk4KKol8",
"colab_type": "text"
"colab_type": "text",
"id": "cl79rk4KKol8"
},
"cell_type": "markdown",
"source": [
"In this tutorial we will learn how to segment images. **Segmentation** is the process of generating pixel-wise segmentations giving the class of the object visible at each pixel. For example, we could be identifying the location and boundaries of people within an image or identifying cell nuclei from an image. Formally, image segmentation refers to the process of partitioning an image into a set of pixels that we desire to identify (our target) and the background. \n",
"\n",
@@ -79,25 +63,27 @@
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"id": "bJcQiA3OdCY6",
"colab": {},
"colab_type": "code",
"colab": {}
"id": "bJcQiA3OdCY6"
},
"cell_type": "code",
"outputs": [],
"source": [
"!pip install kaggle"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"id": "ODNLPGHKKgr-",
"colab": {},
"colab_type": "code",
"colab": {}
"id": "ODNLPGHKKgr-"
},
"cell_type": "code",
"outputs": [],
"source": [
"import os\n",
"import glob\n",
@@ -114,17 +100,17 @@
"import matplotlib.image as mpimg\n",
"import pandas as pd\n",
"from PIL import Image\n"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"id": "YQ9VRReUQxXi",
"colab": {},
"colab_type": "code",
"colab": {}
"id": "YQ9VRReUQxXi"
},
"cell_type": "code",
"outputs": [],
"source": [
"import tensorflow as tf\n",
"import tensorflow.contrib as tfcontrib\n",
@@ -132,28 +118,28 @@
"from tensorflow.python.keras import losses\n",
"from tensorflow.python.keras import models\n",
"from tensorflow.python.keras import backend as K "
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "RW9gk331S0KA",
"colab_type": "text"
"colab_type": "text",
"id": "RW9gk331S0KA"
},
"cell_type": "markdown",
"source": [
"# Get all the files \n",
"Since this tutorial will be using a dataset from Kaggle, it requires [creating an API Token](https://github.com/Kaggle/kaggle-api#api-credentials) for your Kaggle acccount, and uploading it. "
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"id": "sAVM1ZTmdAMR",
"colab": {},
"colab_type": "code",
"colab": {}
"id": "sAVM1ZTmdAMR"
},
"cell_type": "code",
"outputs": [],
"source": [
"import os\n",
"\n",
@@ -182,64 +168,64 @@
" os.chmod(token_file, 600)\n",
"\n",
"get_kaggle_credentials()\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "gh6jkMp8dN5B",
"colab_type": "text"
"colab_type": "text",
"id": "gh6jkMp8dN5B"
},
"cell_type": "markdown",
"source": [
"Only import kaggle after adding the credentials."
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"id": "EoWJ1hb9dOV_",
"colab": {},
"colab_type": "code",
"colab": {}
"id": "EoWJ1hb9dOV_"
},
"cell_type": "code",
"outputs": [],
"source": [
"import kaggle"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "wC-byMdadAMT",
"colab_type": "text"
"colab_type": "text",
"id": "wC-byMdadAMT"
},
"cell_type": "markdown",
"source": [
"### We'll download the data from Kaggle\n",
"Caution, large download ahead - downloading all files will require 14GB of diskspace. "
]
},
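{
"cell_type": "markdown",
"metadata": {},
"source": [
"Before starting the download, it can help to confirm that enough free disk space is available. Here is a minimal sketch using only the standard library (the ~14 GB figure comes from the caution above):"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {},
"outputs": [],
"source": [
"import shutil\n",
"\n",
"# Free space (in GB) on the filesystem that holds the current directory.\n",
"free_gb = shutil.disk_usage('.').free / 1e9\n",
"print('Free disk space: {:.1f} GB'.format(free_gb))\n",
"if free_gb < 14:\n",
"  print('Warning: the full Carvana download needs roughly 14 GB.')"
]
},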
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"id": "6MOTOyU3dAMU",
"colab": {},
"colab_type": "code",
"colab": {}
"id": "6MOTOyU3dAMU"
},
"cell_type": "code",
"outputs": [],
"source": [
"competition_name = 'carvana-image-masking-challenge'"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"id": "3gJSCmWjdAMW",
"colab": {},
"colab_type": "code",
"colab": {}
"id": "3gJSCmWjdAMW"
},
"cell_type": "code",
"outputs": [],
"source": [
"# Download data from Kaggle and create a DataFrame.\n",
"def load_data_from_zip(competition, file):\n",
@@ -253,163 +239,163 @@
" load_data_from_zip(competition, 'train_masks.zip')\n",
" load_data_from_zip(competition, 'train_masks.csv.zip')\n",
" \n"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "l5SZJKPRdXNX",
"colab_type": "text"
"colab_type": "text",
"id": "l5SZJKPRdXNX"
},
"cell_type": "markdown",
"source": [
"You must [accept the competition rules](https://www.kaggle.com/c/carvana-image-masking-challenge/rules) before downloading the data."
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"id": "_SsQjuN2dWmU",
"colab": {},
"colab_type": "code",
"colab": {}
"id": "_SsQjuN2dWmU"
},
"cell_type": "code",
"outputs": [],
"source": [
"get_data(competition_name)"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"id": "wT1kb3q0ghhi",
"colab": {},
"colab_type": "code",
"colab": {}
"id": "wT1kb3q0ghhi"
},
"cell_type": "code",
"outputs": [],
"source": [
"img_dir = os.path.join(competition_name, \"train\")\n",
"label_dir = os.path.join(competition_name, \"train_masks\")"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"id": "9ej-e6cqmRgd",
"colab": {},
"colab_type": "code",
"colab": {}
"id": "9ej-e6cqmRgd"
},
"cell_type": "code",
"outputs": [],
"source": [
"df_train = pd.read_csv(os.path.join(competition_name, 'train_masks.csv'))\n",
"ids_train = df_train['img'].map(lambda s: s.split('.')[0])"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"id": "33i4xFXweztH",
"colab": {},
"colab_type": "code",
"colab": {}
"id": "33i4xFXweztH"
},
"cell_type": "code",
"outputs": [],
"source": [
"x_train_filenames = []\n",
"y_train_filenames = []\n",
"for img_id in ids_train:\n",
" x_train_filenames.append(os.path.join(img_dir, \"{}.jpg\".format(img_id)))\n",
" y_train_filenames.append(os.path.join(label_dir, \"{}_mask.gif\".format(img_id)))"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"id": "DtutNudKbf70",
"colab": {},
"colab_type": "code",
"colab": {}
"id": "DtutNudKbf70"
},
"cell_type": "code",
"outputs": [],
"source": [
"x_train_filenames, x_val_filenames, y_train_filenames, y_val_filenames = \\\n",
" train_test_split(x_train_filenames, y_train_filenames, test_size=0.2, random_state=42)"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"id": "zDycQekHaMqq",
"colab": {},
"colab_type": "code",
"colab": {}
"id": "zDycQekHaMqq"
},
"cell_type": "code",
"outputs": [],
"source": [
"num_train_examples = len(x_train_filenames)\n",
"num_val_examples = len(x_val_filenames)\n",
"\n",
"print(\"Number of training examples: {}\".format(num_train_examples))\n",
"print(\"Number of validation examples: {}\".format(num_val_examples))"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "Nhda5fkPS3JD",
"colab_type": "text"
"colab_type": "text",
"id": "Nhda5fkPS3JD"
},
"cell_type": "markdown",
"source": [
"### Here's what the paths look like "
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"id": "Di1N83ArilzR",
"colab": {},
"colab_type": "code",
"colab": {}
"id": "Di1N83ArilzR"
},
"cell_type": "code",
"outputs": [],
"source": [
"x_train_filenames[:10]"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"id": "Gc-BDv1Zio1z",
"colab": {},
"colab_type": "code",
"colab": {}
"id": "Gc-BDv1Zio1z"
},
"cell_type": "code",
"outputs": [],
"source": [
"y_train_filenames[:10]"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "mhvDoZkbcUa1",
"colab_type": "text"
"colab_type": "text",
"id": "mhvDoZkbcUa1"
},
"cell_type": "markdown",
"source": [
"# Visualize\n",
"Let's take a look at some of the examples of different images in our dataset. "
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"id": "qUA6SDLhozjj",
"colab": {},
"colab_type": "code",
"colab": {}
"id": "qUA6SDLhozjj"
},
"cell_type": "code",
"outputs": [],
"source": [
"display_num = 5\n",
"\n",
@@ -434,51 +420,49 @@
" \n",
"plt.suptitle(\"Examples of Images and their Masks\")\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "d4CPgvPiToB_",
"colab_type": "text"
"colab_type": "text",
"id": "d4CPgvPiToB_"
},
"cell_type": "markdown",
"source": [
"# Set up "
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "HfeMRgyoa2n6",
"colab_type": "text"
"colab_type": "text",
"id": "HfeMRgyoa2n6"
},
"cell_type": "markdown",
"source": [
"Let’s begin by setting up some parameters. We’ll standardize and resize all the shapes of the images. We’ll also set up some training parameters: "
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"id": "oeDoiSFlothe",
"colab": {},
"colab_type": "code",
"colab": {}
"id": "oeDoiSFlothe"
},
"cell_type": "code",
"outputs": [],
"source": [
"img_shape = (256, 256, 3)\n",
"batch_size = 3\n",
"epochs = 5"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "8_d5ATP21npW",
"colab_type": "text"
"colab_type": "text",
"id": "8_d5ATP21npW"
},
"cell_type": "markdown",
"source": [
"Using these exact same parameters may be too computationally intensive for your hardware, so tweak the parameters accordingly. Also, it is important to note that due to the architecture of our UNet version, the size of the image must be evenly divisible by a factor of 32, as we down sample the spatial resolution by a factor of 2 with each `MaxPooling2Dlayer`.\n",
"\n",
@@ -490,11 +474,11 @@
]
},
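{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a quick sanity check, here is a minimal sketch (assuming the five max-pooling stages of the U-Net built below) that traces how the spatial resolution shrinks, and asserts that our chosen image size is divisible by 32:"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {},
"outputs": [],
"source": [
"# Each of the 5 max-pooling stages halves the spatial resolution,\n",
"# so the input side length must be divisible by 2**5 = 32.\n",
"side = img_shape[0]\n",
"assert side % 32 == 0, 'Image size must be evenly divisible by 32'\n",
"for stage in range(5):\n",
"  side //= 2\n",
"  print('After pool {}: {}x{}'.format(stage + 1, side, side))"
]
},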
{
"cell_type": "markdown",
"metadata": {
"id": "_HONB9JbXxDM",
"colab_type": "text"
"colab_type": "text",
"id": "_HONB9JbXxDM"
},
"cell_type": "markdown",
"source": [
"# Build our input pipeline with `tf.data`\n",
"Since we begin with filenames, we will need to build a robust and scalable data pipeline that will play nicely with our model. If you are unfamiliar with **tf.data** you should check out my other tutorial introducing the concept! \n",
@@ -516,33 +500,35 @@
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "EtRA8vILbx2_",
"colab_type": "text"
"colab_type": "text",
"id": "EtRA8vILbx2_"
},
"cell_type": "markdown",
"source": [
"#### Why do we do these image transformations?\n",
"This is known as **data augmentation**. Data augmentation \"increases\" the amount of training data by augmenting them via a number of random transformations. During training time, our model would never see twice the exact same picture. This helps prevent [overfitting](https://developers.google.com/machine-learning/glossary/#overfitting) and helps the model generalize better to unseen data."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "3aGi28u8Cq9M",
"colab_type": "text"
"colab_type": "text",
"id": "3aGi28u8Cq9M"
},
"cell_type": "markdown",
"source": [
"## Processing each pathname"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"id": "Fb_psznAggwr",
"colab": {},
"colab_type": "code",
"colab": {}
"id": "Fb_psznAggwr"
},
"cell_type": "code",
"outputs": [],
"source": [
"def _process_pathnames(fname, label_path):\n",
" # We map this function onto each pathname pair \n",
@@ -557,27 +543,27 @@
" label_img = label_img[:, :, 0]\n",
" label_img = tf.expand_dims(label_img, axis=-1)\n",
" return img, label_img"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "Y4UE28JiCuOk",
"colab_type": "text"
"colab_type": "text",
"id": "Y4UE28JiCuOk"
},
"cell_type": "markdown",
"source": [
"## Shifting the image"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"id": "xdY046OqtGVH",
"colab": {},
"colab_type": "code",
"colab": {}
"id": "xdY046OqtGVH"
},
"cell_type": "code",
"outputs": [],
"source": [
"def shift_img(output_img, label_img, width_shift_range, height_shift_range):\n",
" \"\"\"This fn will perform the horizontal or vertical shift\"\"\"\n",
@@ -596,27 +582,27 @@
" label_img = tfcontrib.image.translate(label_img,\n",
" [width_shift_range, height_shift_range])\n",
" return output_img, label_img"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "qY253aZfCwd2",
"colab_type": "text"
"colab_type": "text",
"id": "qY253aZfCwd2"
},
"cell_type": "markdown",
"source": [
"## Flipping the image randomly "
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"id": "OogLSplstur9",
"colab": {},
"colab_type": "code",
"colab": {}
"id": "OogLSplstur9"
},
"cell_type": "code",
"outputs": [],
"source": [
"def flip_img(horizontal_flip, tr_img, label_img):\n",
" if horizontal_flip:\n",
@@ -625,27 +611,27 @@
" lambda: (tf.image.flip_left_right(tr_img), tf.image.flip_left_right(label_img)),\n",
" lambda: (tr_img, label_img))\n",
" return tr_img, label_img"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "_YIJLIr5Cyyr",
"colab_type": "text"
"colab_type": "text",
"id": "_YIJLIr5Cyyr"
},
"cell_type": "markdown",
"source": [
"## Assembling our transformations into our augment function"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"id": "18WA0Sl3olyn",
"colab": {},
"colab_type": "code",
"colab": {}
"id": "18WA0Sl3olyn"
},
"cell_type": "code",
"outputs": [],
"source": [
"def _augment(img,\n",
" label_img,\n",
@@ -668,17 +654,17 @@
" label_img = tf.to_float(label_img) * scale\n",
" img = tf.to_float(img) * scale \n",
" return img, label_img"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"id": "tkNqQaR2HQbd",
"colab": {},
"colab_type": "code",
"colab": {}
"id": "tkNqQaR2HQbd"
},
"cell_type": "code",
"outputs": [],
"source": [
"def get_baseline_dataset(filenames, \n",
" labels,\n",
@@ -704,28 +690,28 @@
" # It's necessary to repeat our data for all epochs \n",
" dataset = dataset.repeat().batch(batch_size)\n",
" return dataset"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "zwtgius5CRKc",
"colab_type": "text"
"colab_type": "text",
"id": "zwtgius5CRKc"
},
"cell_type": "markdown",
"source": [
"## Set up train and validation datasets\n",
"Note that we apply image augmentation to our training dataset but not our validation dataset. "
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"id": "iu5WmYmOwKrV",
"colab": {},
"colab_type": "code",
"colab": {}
"id": "iu5WmYmOwKrV"
},
"cell_type": "code",
"outputs": [],
"source": [
"tr_cfg = {\n",
" 'resize': [img_shape[0], img_shape[1]],\n",
@@ -736,34 +722,34 @@
" 'height_shift_range': 0.1\n",
"}\n",
"tr_preprocessing_fn = functools.partial(_augment, **tr_cfg)"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"id": "RtzLkDFMpF0T",
"colab": {},
"colab_type": "code",
"colab": {}
"id": "RtzLkDFMpF0T"
},
"cell_type": "code",
"outputs": [],
"source": [
"val_cfg = {\n",
" 'resize': [img_shape[0], img_shape[1]],\n",
" 'scale': 1 / 255.,\n",
"}\n",
"val_preprocessing_fn = functools.partial(_augment, **val_cfg)"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"id": "5cNpECdkaafo",
"colab": {},
"colab_type": "code",
"colab": {}
"id": "5cNpECdkaafo"
},
"cell_type": "code",
"outputs": [],
"source": [
"train_ds = get_baseline_dataset(x_train_filenames,\n",
" y_train_filenames,\n",
@@ -773,27 +759,27 @@
" y_val_filenames, \n",
" preproc_fn=val_preprocessing_fn,\n",
" batch_size=batch_size)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "Yasuvr5IbFlM",
"colab_type": "text"
"colab_type": "text",
"id": "Yasuvr5IbFlM"
},
"cell_type": "markdown",
"source": [
"## Let's see if our image augmentor data pipeline is producing expected results"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"id": "hjoUqbPdHQej",
"colab": {},
"colab_type": "code",
"colab": {}
"id": "hjoUqbPdHQej"
},
"cell_type": "code",
"outputs": [],
"source": [
"temp_ds = get_baseline_dataset(x_train_filenames, \n",
" y_train_filenames,\n",
@@ -816,42 +802,14 @@
" plt.subplot(1, 2, 2)\n",
" plt.imshow(label[0, :, :, 0])\n",
" plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "fvtxCncKsoRd",
"colab_type": "text"
"colab_type": "text",
"id": "fvtxCncKsoRd"
},
"cell_type": "markdown",
"source": [
"# Build the model\n",
"We'll build the U-Net model. U-Net is especially good with segmentation tasks because it can localize well to provide high resolution segmentation masks. In addition, it works well with small datasets and is relatively robust against overfitting as the training data is in terms of the number of patches within an image, which is much larger than the number of training images itself. Unlike the original model, we will add batch normalization to each of our blocks. \n",
@@ -866,12 +824,14 @@
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"id": "zfew1i1F6bK-",
"colab": {},
"colab_type": "code",
"colab": {}
"id": "zfew1i1F6bK-"
},
"cell_type": "code",
"outputs": [],
"source": [
"def conv_block(input_tensor, num_filters):\n",
" encoder = layers.Conv2D(num_filters, (3, 3), padding='same')(input_tensor)\n",
@@ -900,17 +860,17 @@
" decoder = layers.BatchNormalization()(decoder)\n",
" decoder = layers.Activation('relu')(decoder)\n",
" return decoder"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"id": "xRLp21S_hpTn",
"colab": {},
"colab_type": "code",
"colab": {}
"id": "xRLp21S_hpTn"
},
"cell_type": "code",
"outputs": [],
"source": [
"inputs = layers.Input(shape=img_shape)\n",
"# 256\n",
@@ -949,51 +909,49 @@
"# 256\n",
"\n",
"outputs = layers.Conv2D(1, (1, 1), activation='sigmoid')(decoder0)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "luDqDqu8c1AX",
"colab_type": "text"
"colab_type": "text",
"id": "luDqDqu8c1AX"
},
"cell_type": "markdown",
"source": [
"## Define your model\n",
"Using functional API, you must define your model by specifying the inputs and outputs associated with the model. "
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"id": "76QkTzXVczgc",
"colab": {},
"colab_type": "code",
"colab": {}
"id": "76QkTzXVczgc"
},
"cell_type": "code",
"outputs": [],
"source": [
"model = models.Model(inputs=[inputs], outputs=[outputs])"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "p0tNnmyOdtyr",
"colab_type": "text"
"colab_type": "text",
"id": "p0tNnmyOdtyr"
},
"cell_type": "markdown",
"source": [
"# Defining custom metrics and loss functions\n",
"Defining loss and metric functions are simple with Keras. Simply define a function that takes both the True labels for a given example and the Predicted labels for the same given example. "
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "sfuBVut0fogM",
"colab_type": "text"
"colab_type": "text",
"id": "sfuBVut0fogM"
},
"cell_type": "markdown",
"source": [
"Dice loss is a metric that measures overlap. More info on optimizing for Dice coefficient (our dice loss) can be found in the [paper](http://campar.in.tum.de/pub/milletari2016Vnet/milletari2016Vnet.pdf), where it was introduced. \n",
"\n",
@@ -1001,12 +959,14 @@
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"id": "t_8_hbHECUAW",
"colab": {},
"colab_type": "code",
"colab": {}
"id": "t_8_hbHECUAW"
},
"cell_type": "code",
"outputs": [],
"source": [
"def dice_coeff(y_true, y_pred):\n",
" smooth = 1.\n",
@@ -1016,82 +976,80 @@
" intersection = tf.reduce_sum(y_true_f * y_pred_f)\n",
" score = (2. * intersection + smooth) / (tf.reduce_sum(y_true_f) + tf.reduce_sum(y_pred_f) + smooth)\n",
" return score"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"id": "4DgINhlpNaxP",
"colab": {},
"colab_type": "code",
"colab": {}
"id": "4DgINhlpNaxP"
},
"cell_type": "code",
"outputs": [],
"source": [
"def dice_loss(y_true, y_pred):\n",
" loss = 1 - dice_coeff(y_true, y_pred)\n",
" return loss"
]
},
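{
"cell_type": "markdown",
"metadata": {},
"source": [
"A quick sanity check of these definitions (a minimal sketch, evaluated with a throwaway `tf.Session` in keeping with this notebook's TF 1.x style): a perfect prediction should give a dice coefficient near 1 and a dice loss near 0."
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {},
"outputs": [],
"source": [
"# With identical masks, the intersection equals the union,\n",
"# so dice_coeff -> 1 and dice_loss -> 0.\n",
"mask = tf.ones((1, 4, 4, 1))\n",
"with tf.Session() as sess:\n",
"  print(sess.run(dice_coeff(mask, mask)))  # ~1.0\n",
"  print(sess.run(dice_loss(mask, mask)))   # ~0.0"
]
},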
{
"cell_type": "markdown",
"metadata": {
"id": "qqClGNFJdANU",
"colab_type": "text"
"colab_type": "text",
"id": "qqClGNFJdANU"
},
"cell_type": "markdown",
"source": [
"Here, we'll use a specialized loss function that combines binary cross entropy and our dice loss. This is based on [individuals who competed within this competition obtaining better results empirically](https://www.kaggle.com/c/carvana-image-masking-challenge/discussion/40199). "
"Here, we'll use a specialized loss function that combines binary cross entropy and our dice loss. This is based on [individuals who competed within this competition obtaining better results empirically](https://www.kaggle.com/c/carvana-image-masking-challenge/discussion/40199). Try out your own custom losses to measure performance (e.g. bce + log(dice_loss), only bce, etc.)!"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"id": "udrfi9JGB-bL",
"colab": {},
"colab_type": "code",
"colab": {}
"id": "udrfi9JGB-bL"
},
"cell_type": "code",
"outputs": [],
"source": [
"def bce_dice_loss(y_true, y_pred):\n",
" loss = losses.binary_crossentropy(y_true, y_pred) + dice_loss(y_true, y_pred)\n",
" return loss"
]
},
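{
"cell_type": "markdown",
"metadata": {},
"source": [
"As one illustration of the custom-loss experimentation suggested above, here is a hypothetical variant (`bce_log_dice_loss` is our own name, not part of the tutorial's training loop) that combines binary cross entropy with the negative log of the dice coefficient, which penalizes poor overlap more sharply as the overlap approaches zero:"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {},
"outputs": [],
"source": [
"def bce_log_dice_loss(y_true, y_pred):\n",
"  # -log(dice_coeff) grows without bound as the overlap approaches zero,\n",
"  # giving a stronger gradient signal than 1 - dice_coeff early in training.\n",
"  return losses.binary_crossentropy(y_true, y_pred) - tf.log(dice_coeff(y_true, y_pred))"
]
},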
{
"cell_type": "markdown",
"metadata": {
"id": "LifmpjXNc9Gz",
"colab_type": "text"
"colab_type": "text",
"id": "LifmpjXNc9Gz"
},
"cell_type": "markdown",
"source": [
"## Compile your model\n",
"We use our custom loss function to minimize. In addition, we specify what metrics we want to keep track of as we train. Note that metrics are not actually used during the training process to tune the parameters, but are instead used to measure performance of the training process. "
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"id": "gflcWk2Cc8Bi",
"colab": {},
"colab_type": "code",
"colab": {}
"id": "gflcWk2Cc8Bi"
},
"cell_type": "code",
"outputs": [],
"source": [
"model.compile(optimizer='adam', loss=bce_dice_loss, metrics=[dice_loss])\n",
"\n",
"model.summary()"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "8WG_8iZ_dMbK",
"colab_type": "text"
"colab_type": "text",
"id": "8WG_8iZ_dMbK"
},
"cell_type": "markdown",
"source": [
"## Train your model\n",
"Training your model with `tf.data` involves simply providing the model's `fit` function with your training/validation dataset, the number of steps, and epochs. \n",
@@ -1100,36 +1058,38 @@
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"id": "1nHnj6199elZ",
"colab": {},
"colab_type": "code",
"colab": {}
"id": "1nHnj6199elZ"
},
"cell_type": "code",
"outputs": [],
"source": [
"save_model_path = '/tmp/weights.hdf5'\n",
"cp = tf.keras.callbacks.ModelCheckpoint(filepath=save_model_path, monitor='val_dice_loss', mode='max', save_best_only=True)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "vJP_EvuTb4hH",
"colab_type": "text"
"colab_type": "text",
"id": "vJP_EvuTb4hH"
},
"cell_type": "markdown",
"source": [
"Don't forget to specify our model callback in the `fit` function call. "
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"id": "UMZcOrq5aaj1",
"colab": {},
"colab_type": "code",
"colab": {}
"id": "UMZcOrq5aaj1"
},
"cell_type": "code",
"outputs": [],
"source": [
"history = model.fit(train_ds, \n",
" steps_per_epoch=int(np.ceil(num_train_examples / float(batch_size))),\n",
@@ -1137,27 +1097,27 @@
" validation_data=val_ds,\n",
" validation_steps=int(np.ceil(num_val_examples / float(batch_size))),\n",
" callbacks=[cp])"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "gCAUsoxfTTrh",
"colab_type": "text"
"colab_type": "text",
"id": "gCAUsoxfTTrh"
},
"cell_type": "markdown",
"source": [
"# Visualize training process"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"id": "AvntxymYn8rM",
"colab": {},
"colab_type": "code",
"colab": {}
"id": "AvntxymYn8rM"
},
"cell_type": "code",
"outputs": [],
"source": [
"dice = = history.history['dice_loss']\n",
"val_dice = history.history['val_dice_loss']\n",
@@ -1181,26 +1141,24 @@
"plt.title('Training and Validation Loss')\n",
"\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "dWPhb87GdhkG",
"colab_type": "text"
"colab_type": "text",
"id": "dWPhb87GdhkG"
},
"cell_type": "markdown",
"source": [
"Even with only 5 epochs, we see strong performance."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "MGFKf8yCTYbw",
"colab_type": "text"
"colab_type": "text",
"id": "MGFKf8yCTYbw"
},
"cell_type": "markdown",
"source": [
"# Visualize actual performance \n",
"We'll visualize our performance on the validation set.\n",
@@ -1209,11 +1167,11 @@
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "oIddsUcM_KeI",
"colab_type": "text"
"colab_type": "text",
"id": "oIddsUcM_KeI"
},
"cell_type": "markdown",
"source": [
"To load our model we have two options:\n",
"1. Since our model architecture is already in memory, we can simply call `load_weights(save_model_path)`\n",
@@ -1225,27 +1183,29 @@
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"id": "5Ph7acmrCXm6",
"colab": {},
"colab_type": "code",
"colab": {}
"id": "5Ph7acmrCXm6"
},
"cell_type": "code",
"outputs": [],
"source": [
"# Alternatively, load the weights directly: model.load_weights(save_model_path)\n",
"model = models.load_model(save_model_path, custom_objects={'bce_dice_loss': bce_dice_loss,\n",
" 'dice_coeff': dice_coeff})"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"id": "0GnwZ7CPaamI",
"colab": {},
"colab_type": "code",
"colab": {}
"id": "0GnwZ7CPaamI"
},
"cell_type": "code",
"outputs": [],
"source": [
"# Let's visualize some of the outputs \n",
"data_aug_iter = val_ds.make_one_shot_iterator()\n",
@@ -1268,19 +1228,16 @@
" plt.subplot(5, 3, 3 * i + 3)\n",
" plt.imshow(predicted_label[:, :, 0])\n",
" plt.title(\"Predicted Mask\")\n",
"plt.save\n",
"plt.suptitle(\"Examples of Input Image, Label, and Prediction\")\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "iPV7RMA9TjPC",
"colab_type": "text"
"colab_type": "text",
"id": "iPV7RMA9TjPC"
},
"cell_type": "markdown",
"source": [
"# Key Takeaways\n",
"In this tutorial we learned how to train a network to automatically detect and create cutouts of cars from images! \n",
@@ -1292,5 +1249,34 @@
"* **Save and load our model** - We saved our best model that we encountered according to our specified metric. When we wanted to perform inference with out best model, we loaded it from disk. Note that saving the model capture more than just the weights of the model: by default, it saves the model architecture, weights, as well as information about the training process such as the state of the optimizer, etc. "
]
}
],
"metadata": {
"accelerator": "GPU",
"colab": {
"collapsed_sections": [],
"name": "Image Segmentation",
"private_outputs": true,
"provenance": [],
"version": "0.3.2"
},
"kernelspec": {
"display_name": "Python [default]",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.4"
}
},
"nbformat": 4,
"nbformat_minor": 1
}