Commit 048f6f50 authored by Soumith Chintala, committed by GitHub

Merge pull request #31 from graingert/fix-package

fix package
parents ce097e1a 600250ae
include README.rst
recursive-exclude * __pycache__
recursive-exclude * *.py[co]
# torch-vision
This repository consists of:
- [vision.datasets](#datasets) : Data loaders for popular vision datasets
- [vision.models](#models) : Definitions for popular model architectures, such as AlexNet, VGG, and ResNet, along with pre-trained models.
- [vision.transforms](#transforms) : Common image transformations such as random crop, rotation, etc.
- [vision.utils](#utils) : Utilities such as saving a tensor (3 x H x W) as an image to disk, or making a grid of images from a mini-batch.
# Installation
Binaries:
```bash
conda install torchvision -c https://conda.anaconda.org/t/6N-MsQ4WZ7jo/soumith
```
From Source:
```bash
pip install -r requirements.txt
pip install .
```
# Datasets
The following dataset loaders are available:
- [COCO (Captioning and Detection)](#coco)
- [LSUN Classification](#lsun)
- [ImageFolder](#imagefolder)
- [Imagenet-12](#imagenet-12)
- [CIFAR10 and CIFAR100](#cifar)
Datasets have the API:
- `__getitem__`
- `__len__`

They all subclass `torch.utils.data.Dataset`, so they can all be loaded in parallel (via Python multiprocessing) using a standard `torch.utils.data.DataLoader`.
For example:
`torch.utils.data.DataLoader(coco_cap, batch_size=args.batchSize, shuffle=True, num_workers=args.nThreads)`
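For instance, a minimal runnable sketch using CIFAR10 (described below; `root='data'` and the loader settings are illustrative):
```python
import torch.utils.data
import torchvision.datasets as dset
import torchvision.transforms as transforms

# any of the datasets below works here; CIFAR10 can download itself
cifar = dset.CIFAR10(root='data', train=True, download=True,
                     transform=transforms.ToTensor())
loader = torch.utils.data.DataLoader(cifar, batch_size=64,
                                     shuffle=True, num_workers=2)
for images, labels in loader:
    pass  # images is a 64 x 3 x 32 x 32 FloatTensor, labels a LongTensor of 64 labels
```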
In the constructor, each dataset has a slightly different API as needed, but they all take the keyword arguments:
- `transform` - a function that takes in an image and returns a transformed version
  - common transforms such as `ToTensor`, `RandomCrop`, etc. can be composed together with `transforms.Compose` (see the transforms section below)
- `target_transform` - a function that takes in the target and transforms it; for example, take in the caption string and return a tensor of word indices (a sketch follows this list)
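For illustration, a `target_transform` that maps captions to word indices might look like this (the vocabulary `word_to_idx` is a made-up example, not part of torchvision):
```python
import torch

# hypothetical vocabulary built beforehand from the training captions
word_to_idx = {'<unk>': 0, 'a': 1, 'plane': 2, 'mountain': 3}

def caption_to_indices(captions):
    # CocoCaptions returns a list of caption strings per image; index the first
    words = captions[0].lower().split()
    return torch.LongTensor([word_to_idx.get(w, 0) for w in words])

# used as: dset.CocoCaptions(..., target_transform=caption_to_indices)
```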
### COCO
This requires the [COCO API to be installed](https://github.com/pdollar/coco/tree/master/PythonAPI).
#### Captions:
`dset.CocoCaptions(root="dir where images are", annFile="json annotation file", [transform, target_transform])`
Example:
```python
import torchvision.datasets as dset
import torchvision.transforms as transforms
cap = dset.CocoCaptions(root='dir where images are',
                        annFile='json annotation file',
                        transform=transforms.ToTensor())
print('Number of samples: ', len(cap))
img, target = cap[3] # load 4th sample
print("Image Size: ", img.size())
print(target)
```
Output:
```
Number of samples: 82783
Image Size: (3L, 427L, 640L)
[u'A plane emitting smoke stream flying over a mountain.',
u'A plane darts across a bright blue sky behind a mountain covered in snow',
u'A plane leaves a contrail above the snowy mountain top.',
u'A mountain that has a plane flying overheard in the distance.',
u'A mountain view with a plume of smoke in the background']
```
#### Detection:
`dset.CocoDetection(root="dir where images are", annFile="json annotation file", [transform, target_transform])`
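A minimal usage sketch (the paths are placeholders, as above; each target is the list of COCO annotation dicts for that image):
```python
import torchvision.datasets as dset

det = dset.CocoDetection(root='dir where images are',
                         annFile='json annotation file')
img, anns = det[0]  # a PIL.Image and its list of annotation dicts
print(len(anns))    # number of annotated objects in this image
```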
### LSUN
`dset.LSUN(db_path, classes='train', [transform, target_transform])`
- `db_path` - root directory for the database files
- `classes` - one of:
  - `'train'` - all categories, training set
  - `'val'` - all categories, validation set
  - `'test'` - all categories, test set
  - `['bedroom_train', 'church_train', ...]` - a list of categories to load
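For example, a sketch that loads only two training categories (`db_path='lsun'` is a placeholder directory):
```python
import torchvision.datasets as dset
import torchvision.transforms as transforms

lsun = dset.LSUN(db_path='lsun',
                 classes=['bedroom_train', 'church_train'],
                 transform=transforms.ToTensor())
img, class_idx = lsun[0]  # class_idx indexes into the list of loaded categories
```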
### CIFAR
`dset.CIFAR10(root, train=True, transform=None, target_transform=None, download=False)`
`dset.CIFAR100(root, train=True, transform=None, target_transform=None, download=False)`
- `root` : root directory of the dataset where the folder `cifar-10-batches-py` exists
- `train` : `True` = training set, `False` = test set
- `download` : `True` = downloads the dataset from the internet and puts it in the root directory. If the dataset is already downloaded, it is not downloaded again.
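For example (`root='data'` is a placeholder directory):
```python
import torchvision.datasets as dset
import torchvision.transforms as transforms

train_set = dset.CIFAR10(root='data', train=True, download=True,
                         transform=transforms.ToTensor())
test_set = dset.CIFAR10(root='data', train=False,
                        transform=transforms.ToTensor())
img, label = train_set[0]  # a 3 x 32 x 32 FloatTensor and an integer label in [0, 9]
```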
### ImageFolder
A generic data loader where the images are arranged in this way:
```
root/dog/xxx.png
root/dog/xxy.png
root/dog/xxz.png
root/cat/123.png
root/cat/nsdf3.png
root/cat/asd932_.png
```
`dset.ImageFolder(root="root folder path", [transform, target_transform])`
It has the members:
- `self.classes` - The class names as a list
- `self.class_to_idx` - Corresponding class indices
- `self.imgs` - The list of (image path, class-index) tuples
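A minimal sketch, assuming `root` is laid out as above:
```python
import torchvision.datasets as dset
import torchvision.transforms as transforms

data = dset.ImageFolder(root='root folder path',
                        transform=transforms.ToTensor())
print(data.classes)       # e.g. ['cat', 'dog']
img, class_idx = data[0]  # class_idx indexes into data.classes
```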
### Imagenet-12
This is simply implemented with an ImageFolder dataset.
The data is preprocessed [as described here](https://github.com/facebook/fb.resnet.torch/blob/master/INSTALL.md#download-the-imagenet-dataset).
[Here is an example](https://github.com/pytorch/examples/blob/27e2a46c1d1505324032b1d94fc6ce24d5b67e97/imagenet/main.py#L48-L62).
# Models
The models subpackage contains definitions for the following model architectures:
- [AlexNet](https://arxiv.org/abs/1404.5997): AlexNet variant from the "One weird trick" paper.
- [VGG](https://arxiv.org/abs/1409.1556): VGG-11, VGG-13, VGG-16, VGG-19 (with and without batch normalization)
- [ResNet](https://arxiv.org/abs/1512.03385): ResNet-18, ResNet-34, ResNet-50, ResNet-101, ResNet-152
You can construct a model with random weights by calling its constructor:
```python
import torchvision.models as models
resnet18 = models.resnet18()
alexnet = models.alexnet()
```
We provide pre-trained models for the ResNet variants and AlexNet, using the
PyTorch [model zoo](http://pytorch.org/docs/model_zoo.html). These can
be constructed by passing `pretrained=True`:
```python
import torchvision.models as models
resnet18 = models.resnet18(pretrained=True)
alexnet = models.alexnet(pretrained=True)
```
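For illustration, here is a sketch of running a pre-trained network on a single image (`example.jpg` is a placeholder; the mean/std values are the ImageNet statistics used with these pre-trained weights):
```python
from PIL import Image
from torch.autograd import Variable
import torchvision.models as models
import torchvision.transforms as transforms

preprocess = transforms.Compose([
    transforms.Scale(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

model = models.resnet18(pretrained=True)
model.eval()  # switch batch norm layers to evaluation mode

img = Image.open('example.jpg')       # placeholder input image
batch = preprocess(img).unsqueeze(0)  # add a batch dimension: 1 x 3 x 224 x 224
scores = model(Variable(batch))       # 1 x 1000 class scores
```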
# Transforms
Transforms are common image transforms.
They can be chained together using `transforms.Compose`.
### `transforms.Compose`
One can compose several transforms together.
For example:
```python
transform = transforms.Compose([
    transforms.RandomSizedCrop(224),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])
```
## Transforms on PIL.Image
### `Scale(size, interpolation=Image.BILINEAR)`
Rescales the input PIL.Image to the given `size`, where `size` is the size of the smaller edge.
For example, if height > width, the image will be rescaled to (size * height / width, size).
- size: size of the smaller edge
- interpolation: Default: PIL.Image.BILINEAR
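A quick sketch of the resulting size (the input here is a synthetic image):
```python
from PIL import Image
import torchvision.transforms as transforms

scale = transforms.Scale(256)
img = Image.new('RGB', (640, 480))  # PIL sizes are (width, height)
print(scale(img).size)              # (341, 256): the smaller edge becomes 256
```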
### `CenterCrop(size)`
Crops the given PIL.Image at the center to have a region of the given size. `size` can be a tuple (target_height, target_width) or an integer, in which case the target will be a square of shape (size, size).
### `RandomCrop(size, padding=0)`
Crops the given PIL.Image at a random location to have a region of the given size. `size` can be a tuple (target_height, target_width) or an integer, in which case the target will be a square of shape (size, size).
If `padding` is non-zero, the image is first zero-padded on each side with `padding` pixels.
### `RandomHorizontalFlip()`
Randomly horizontally flips the given PIL.Image with a probability of 0.5
### `RandomSizedCrop(size, interpolation=Image.BILINEAR)`
Randomly crops the given PIL.Image to a random size of 0.08 to 1.0 of the original size, with a random aspect ratio of 3/4 to 4/3 of the original aspect ratio.
This is popularly used to train Inception networks.
- size: size of the smaller edge
- interpolation: Default: PIL.Image.BILINEAR
### `Pad(padding, fill=0)`
Pads the given image on each side with `padding` pixels, filled with pixel value `fill`.
For example, a `5x5` image padded with `padding=1` becomes `7x7`.
## Transforms on torch.*Tensor
### `Normalize(mean, std)`
Given mean: (R, G, B) and std: (R, G, B), normalizes each channel of the torch.*Tensor, i.e. `channel = (channel - mean) / std`.
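For example (here on a synthetic tensor in [0, 1]):
```python
import torch
import torchvision.transforms as transforms

normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                 std=[0.229, 0.224, 0.225])
img = torch.rand(3, 224, 224)  # a C x H x W tensor
out = normalize(img)           # each channel is now (channel - mean) / std
```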
## Conversion Transforms
- `ToTensor()` - Converts a PIL.Image (RGB) or numpy.ndarray (H x W x C) in the range [0, 255] to a torch.FloatTensor of shape (C x H x W) in the range [0.0, 1.0]
- `ToPILImage()` - Converts a torch.*Tensor of range [0, 1] and shape (C x H x W), or a numpy.ndarray of dtype uint8, range [0, 255] and shape (H x W x C), to a PIL.Image of range [0, 255]
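A minimal round-trip sketch (the input is a synthetic image):
```python
from PIL import Image
import torchvision.transforms as transforms

pil_img = Image.new('RGB', (640, 480))  # synthetic 640 x 480 image
t = transforms.ToTensor()(pil_img)      # FloatTensor of shape 3 x 480 x 640, range [0.0, 1.0]
back = transforms.ToPILImage()(t)       # PIL.Image of size (640, 480) again
```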
## Generic Transforms
### `Lambda(lambda)`
Given a Python lambda, applies it to the input `img` and returns the result.
For example:
```python
transforms.Lambda(lambda x: x.add(10))
```
# Utils
### `make_grid(tensor, nrow=8, padding=2)`
Given a 4D mini-batch Tensor of shape (B x C x H x W), makes a grid of images.
### `save_image(tensor, filename, nrow=8, padding=2)`
Saves a given Tensor into an image file.
If given a mini-batch tensor, will save the tensor as a grid of images.
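For example, a minimal sketch with a random mini-batch (`samples.png` is a placeholder filename):
```python
import torch
import torchvision.utils as vutils

batch = torch.rand(16, 3, 32, 32)                # a B x C x H x W mini-batch
grid = vutils.make_grid(batch, nrow=4)           # a single 3 x H x W image: a 4 x 4 grid
vutils.save_image(batch, 'samples.png', nrow=4)  # writes the same grid to disk
```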
[bdist_wheel]
universal=1
```diff
@@ -4,12 +4,12 @@ import shutil
 import sys
 from setuptools import setup, find_packages
 
-VERSION = '0.1.6'
-
-long_description = '''torch-vision provides DataLoaders, Pre-trained models
-and common transforms for torch for images and videos'''
-
-setup_info = dict(
+readme = open('README.rst').read()
+
+VERSION = '0.1.6'
+
+setup(
     # Metadata
     name='torchvision',
     version=VERSION,
@@ -17,7 +17,7 @@ setup_info = dict(
     author_email='soumith@pytorch.org',
     url='https://github.com/pytorch/vision',
     description='image and video datasets and models for torch deep learning',
-    long_description=long_description,
+    long_description=readme,
     license='BSD',
 
     # Package info
@@ -25,5 +25,3 @@ setup_info = dict(
     zip_safe=True,
 )
-
-setup(**setup_info)
```