Unverified commit 876117b5, authored by Nicolas Hug, committed by GitHub

Remove old notebook examples and point to the online gallery (#4244)

parent a9b38db2
examples/python/README.md (new version):

# Python examples

The examples in this directory have been moved online in our [gallery page](https://pytorch.org/vision/stable/auto_examples/index.html).

examples/python/README.md (removed version):

# Python examples

- [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/pytorch/vision/blob/master/examples/python/tensor_transforms.ipynb)
  [Examples of Tensor Images transformations](https://github.com/pytorch/vision/blob/master/examples/python/tensor_transforms.ipynb)
- [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/pytorch/vision/blob/master/examples/python/video_api.ipynb)
  [Example of VideoAPI](https://github.com/pytorch/vision/blob/master/examples/python/video_api.ipynb)
- [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/pytorch/vision/blob/master/examples/python/visualization_utils.ipynb)
  [Example of Visualization Utils](https://github.com/pytorch/vision/blob/master/examples/python/visualization_utils.ipynb)
Prior to v0.8.0, transforms in torchvision were PIL-centric, which imposed several limitations. Since v0.8.0, transform implementations work on both Tensors and PIL images, and we gain the following new features (a short sketch follows the list):
- transform multi-band torch tensor images (with more than 3-4 channels)
- torchscript transforms together with your model for deployment
- support for GPU acceleration
- batched transformations, e.g. on videos
- read and decode image data directly as torch tensors, with torchscript support (for PNG and JPEG image formats)
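For illustration, here is a minimal sketch of the tensor path; the shapes and values are invented for the example, and only transforms known to be scriptable since v0.8.0 are used:

```python
import torch
import torch.nn as nn
import torchvision.transforms as T

# Tensor-compatible transforms compose with nn.Sequential, script with
# torch.jit, and apply to whole batches on CPU or GPU.
transforms = nn.Sequential(
    T.CenterCrop(200),
    T.ConvertImageDtype(torch.float),
)
scripted = torch.jit.script(transforms)

# A fake batch of four 3-channel uint8 images (e.g. video frames).
batch = torch.randint(0, 256, (4, 3, 256, 256), dtype=torch.uint8)
out = scripted(batch)
print(out.shape, out.dtype)  # torch.Size([4, 3, 200, 200]) torch.float32
```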
Furthermore, we previously provided only a very high-level API for video decoding, which left little control to the user. We are now expanding that API (and will eventually replace it) with a lower-level one that gives the user frame-based access to a video.
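As a hedged sketch of the frame-based access this enables (the file name `video.mp4` is an assumption; `VideoReader` is the fine-grained reader shipped alongside the high-level functions):

```python
from torchvision.io import VideoReader

# Decode frames one at a time instead of loading the whole clip.
reader = VideoReader("video.mp4", "video")
reader.seek(2.0)  # jump to the 2-second mark before iterating
for frame in reader:
    data, pts = frame["data"], frame["pts"]  # decoded tensor and timestamp
```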
Torchvision also provides utilities to visualize results: you can make grids of images and plot bounding boxes as well as segmentation masks. These utilities work standalone and also with torchvision detection and segmentation models.
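A minimal sketch of these utilities, using random placeholder images and box coordinates:

```python
import torch
from torchvision.utils import make_grid, draw_bounding_boxes

# Fake uint8 images and a single box, purely for illustration.
images = torch.randint(0, 256, (4, 3, 64, 64), dtype=torch.uint8)

grid = make_grid(images, nrow=2)  # tile the batch into one image
boxed = draw_bounding_boxes(images[0], boxes=torch.tensor([[8, 8, 56, 56]]),
                            labels=["object"])
```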
{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "vjAC2mZnb4nz"
},
"source": [
"# Image transformations\n",
"\n",
"This notebook shows new features of torchvision image transformations. \n",
"\n",
"Prior to v0.8.0, transforms in torchvision have traditionally been PIL-centric and presented multiple limitations due to that. Now, since v0.8.0, transforms implementations are Tensor and PIL compatible and we can achieve the following new \n",
"features:\n",
"- transform multi-band torch tensor images (with more than 3-4 channels) \n",
"- torchscript transforms together with your model for deployment\n",
"- support for GPU acceleration\n",
"- batched transformation such as for videos\n",
"- read and decode data directly as torch tensor with torchscript support (for PNG and JPEG image formats)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 35
},
"id": "btaDWPDbgIyW",
"outputId": "8a83d408-f643-42da-d247-faf3a1bd3ae0"
},
"outputs": [],
"source": [
"import torch, torchvision\n",
"torch.__version__, torchvision.__version__"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "9Vj9draNb4oA"
},
"source": [
"## Transforms on CPU/CUDA tensor images\n",
"\n",
"Let's show how to apply transformations on images opened directly as a torch tensors.\n",
"Now, torchvision provides image reading functions for PNG and JPG images with torchscript support. "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "Epp3hCy0b4oD"
},
"outputs": [],
"source": [
"from torchvision.datasets.utils import download_url\n",
"\n",
"download_url(\"https://farm1.static.flickr.com/152/434505223_8d1890e1e2.jpg\", \".\", \"test-image.jpg\")\n",
"download_url(\"https://farm3.static.flickr.com/2142/1896267403_24939864ba.jpg\", \".\", \"test-image2.jpg\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "Y-m7lYDPb4oK"
},
"outputs": [],
"source": [
"import matplotlib.pylab as plt\n",
"%matplotlib inline"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 303
},
"id": "5bi8Q7L3b4oc",
"outputId": "e5de5c73-e16d-4992-ebee-94c7ddf0bf54"
},
"outputs": [],
"source": [
"from torchvision.io.image import read_image\n",
"\n",
"tensor_image = read_image(\"test-image.jpg\")\n",
"\n",
"print(\"tensor image info: \", tensor_image.shape, tensor_image.dtype)\n",
"\n",
"plt.imshow(tensor_image.numpy().transpose((1, 2, 0)))"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def to_rgb_image(tensor):\n",
" \"\"\"Helper method to get RGB numpy array for plotting\"\"\"\n",
" np_img = tensor.cpu().numpy().transpose((1, 2, 0))\n",
" m1, m2 = np_img.min(axis=(0, 1)), np_img.max(axis=(0, 1))\n",
" return (255.0 * (np_img - m1) / (m2 - m1)).astype(\"uint8\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 322
},
"id": "PgWpjxQ3b4pF",
"outputId": "e9a138e8-b45c-4f75-d849-3b41de0e5472"
},
"outputs": [],
"source": [
"import torchvision.transforms as T\n",
"\n",
"# to fix random seed is now:\n",
"torch.manual_seed(12)\n",
"\n",
"transforms = T.Compose([\n",
" T.RandomCrop(224),\n",
" T.RandomHorizontalFlip(p=0.3),\n",
" T.ConvertImageDtype(torch.float),\n",
" T.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])\n",
"])\n",
"\n",
"out_image = transforms(tensor_image)\n",
"print(\"output tensor image info: \", out_image.shape, out_image.dtype)\n",
"\n",
"plt.imshow(to_rgb_image(out_image))"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "LmYQB4cxb4pI"
},
"source": [
"Tensor images can be on GPU"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 322
},
"id": "S6syYJGEb4pN",
"outputId": "86bddb64-e648-45f2-c216-790d43cfc26d"
},
"outputs": [],
"source": [
"out_image = transforms(tensor_image.to(\"cuda\"))\n",
"print(\"output tensor image info: \", out_image.shape, out_image.dtype, out_image.device)\n",
"\n",
"plt.imshow(to_rgb_image(out_image))"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "jg9TQd7ajfyn"
},
"source": [
"## Scriptable transforms for easier deployment via torchscript\n",
"\n",
"Next, we show how to combine input transformations and model's forward pass and use `torch.jit.script` to obtain a single scripted module.\n",
"\n",
"**Note:** we have to use only scriptable transformations that should be derived from `torch.nn.Module`. \n",
"Since v0.8.0, all transformations are scriptable except `Compose`, `RandomChoice`, `RandomOrder`, `Lambda` and those applied on PIL images.\n",
"The transformations like `Compose` are kept for backward compatibility and can be easily replaced by existing torch modules, like `nn.Sequential`.\n",
"\n",
"Let's define a module `Predictor` that transforms input tensor and applies ImageNet pretrained resnet18 model on it."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "NSDOJ3RajfvO"
},
"outputs": [],
"source": [
"import torch\n",
"import torch.nn as nn\n",
"import torchvision.transforms as T\n",
"from torchvision.io.image import read_image\n",
"from torchvision.models import resnet18\n",
"\n",
"\n",
"class Predictor(nn.Module):\n",
"\n",
" def __init__(self):\n",
" super().__init__()\n",
" self.resnet18 = resnet18(pretrained=True).eval()\n",
" self.transforms = nn.Sequential(\n",
" T.Resize([256, ]), # We use single int value inside a list due to torchscript type restrictions\n",
" T.CenterCrop(224),\n",
" T.ConvertImageDtype(torch.float),\n",
" T.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])\n",
" )\n",
"\n",
" def forward(self, x: torch.Tensor) -> torch.Tensor:\n",
" with torch.no_grad():\n",
" x = self.transforms(x)\n",
" y_pred = self.resnet18(x)\n",
" return y_pred.argmax(dim=1)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "ZZKDovqej5vA"
},
"source": [
"Now, let's define scripted and non-scripted instances of `Predictor` and apply on multiple tensor images of the same size"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "GBBMSo7vjfr0"
},
"outputs": [],
"source": [
"from torchvision.io.image import read_image\n",
"\n",
"predictor = Predictor().to(\"cuda\")\n",
"scripted_predictor = torch.jit.script(predictor).to(\"cuda\")\n",
"\n",
"\n",
"tensor_image1 = read_image(\"test-image.jpg\")\n",
"tensor_image2 = read_image(\"test-image2.jpg\")\n",
"batch = torch.stack([tensor_image1[:, -320:, :], tensor_image2[:, -320:, :]]).to(\"cuda\")\n",
"\n",
"res1 = scripted_predictor(batch)\n",
"res2 = predictor(batch)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 501
},
"id": "Dmi9r_p-oKsk",
"outputId": "b9c55e7d-5db1-4975-c485-fecc4075bf47"
},
"outputs": [],
"source": [
"import json\n",
"from torchvision.datasets.utils import download_url\n",
"\n",
"\n",
"download_url(\"https://s3.amazonaws.com/deep-learning-models/image-models/imagenet_class_index.json\", \".\", \"imagenet_class_index.json\")\n",
"\n",
"\n",
"with open(\"imagenet_class_index.json\", \"r\") as h:\n",
" labels = json.load(h)\n",
"\n",
"\n",
"plt.figure(figsize=(12, 7))\n",
"for i, p in enumerate(res1):\n",
" plt.subplot(1, 2, i + 1)\n",
" plt.title(\"Scripted predictor:\\n{label})\".format(label=labels[str(p.item())]))\n",
" plt.imshow(batch[i, ...].cpu().numpy().transpose((1, 2, 0)))\n",
"\n",
"\n",
"plt.figure(figsize=(12, 7))\n",
"for i, p in enumerate(res2):\n",
" plt.subplot(1, 2, i + 1)\n",
" plt.title(\"Original predictor:\\n{label})\".format(label=labels[str(p.item())]))\n",
" plt.imshow(batch[i, ...].cpu().numpy().transpose((1, 2, 0)))"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "7IYsjzpFqcK8"
},
"source": [
"We save and reload scripted predictor in Python or C++ and use it for inference:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 52
},
"id": "0kk9LLw5jfol",
"outputId": "05ea6db7-7fcf-4b74-a763-5f117c14cc00"
},
"outputs": [],
"source": [
"scripted_predictor.save(\"scripted_predictor.pt\")\n",
"\n",
"scripted_predictor = torch.jit.load(\"scripted_predictor.pt\")\n",
"res1 = scripted_predictor(batch)\n",
"\n",
"for i, p in enumerate(res1):\n",
" print(\"Scripted predictor: {label})\".format(label=labels[str(p.item())]))\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Data reading and decoding functions also support torch script and therefore can be part of the model as well:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"class AnotherPredictor(Predictor):\n",
"\n",
" def forward(self, path: str) -> int:\n",
" with torch.no_grad():\n",
" x = read_image(path).unsqueeze(0)\n",
" x = self.transforms(x)\n",
" y_pred = self.resnet18(x)\n",
" return int(y_pred.argmax(dim=1).item())"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "-cMwTs3Yjffy"
},
"outputs": [],
"source": [
"scripted_predictor2 = torch.jit.script(AnotherPredictor())\n",
"\n",
"res = scripted_predictor2(\"test-image.jpg\")\n",
"\n",
"print(\"Scripted another predictor: {label})\".format(label=labels[str(res)]))"
]
}
],
"metadata": {
"accelerator": "GPU",
"colab": {
"collapsed_sections": [],
"name": "torchvision_scriptable_transforms.ipynb",
"provenance": []
},
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.4"
}
},
"nbformat": 4,
"nbformat_minor": 4
}