Commit 8fbc9bb6 authored by Hang Zhang

v0.2.0

parent 01946d40
# PyTorch-Encoding
created by [Hang Zhang](http://hangzh.com/)
## [Documentation](http://hangzh.com/PyTorch-Encoding/)
- Please visit the [**Docs**](http://hangzh.com/PyTorch-Encoding/) for detailed installation and usage instructions.
- [**Link**](http://hangzh.com/PyTorch-Encoding/experiments/texture.html) to the Deep TEN texture classification experiments and pre-trained models.
## Citations
## Citation
**Context Encoding for Semantic Segmentation**
[Hang Zhang](http://hangzh.com/), [Kristin Dana](http://eceweb1.rutgers.edu/vision/dana.html), [Jianping Shi](http://shijianping.me/), [Zhongyue Zhang](http://zhongyuezhang.com/), [Xiaogang Wang](http://www.ee.cuhk.edu.hk/~xgwang/), [Ambrish Tyagi](https://scholar.google.com/citations?user=GaSWCoUAAAAJ&hl=en), [Amit Agrawal](http://www.amitkagrawal.com/)
```
@InProceedings{Zhang_2018_CVPR,
author = {Zhang, Hang and Dana, Kristin and Shi, Jianping and Zhang, Zhongyue and Wang, Xiaogang and Tyagi, Ambrish and Agrawal, Amit},
title = {Context Encoding for Semantic Segmentation},
booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2018}
}
```
**Deep TEN: Texture Encoding Network** [[arXiv]](https://arxiv.org/pdf/1612.02844.pdf)
[Hang Zhang](http://hangzh.com/), [Jia Xue](http://jiaxueweb.com/), [Kristin Dana](http://eceweb1.rutgers.edu/vision/dana.html)
......
......@@ -29,6 +29,7 @@ if platform.system() == 'Darwin':
ENCODING_LIB = os.path.join(cwd, 'encoding/lib/libENCODING.dylib')
else:
os.environ['CFLAGS'] = '-std=c99'
os.environ['TH_LIBRARIES'] = os.path.join(lib_path,'libATen.so.1')
ENCODING_LIB = os.path.join(cwd, 'encoding/lib/libENCODING.so')
......
......@@ -8,20 +8,18 @@ Deep TEN: Deep Texture Encoding Network Example
In this section, we show an example of training/testing Encoding-Net for texture recognition on the MINC-2500 dataset. Compared to the original Torch implementation, we use a *different learning rate* for the pre-trained base network and the encoding layer (10x), disable color jittering after reducing the learning rate, and adopt a much *smaller training image size* (224 instead of 352).
.. note::
**Make Sure** to `Install PyTorch Encoding <../notes/compile.html>`_ First.
Test Pre-trained Model
----------------------
- Clone the GitHub repo (I am sure you did during the installation)::
- Clone the GitHub repo::
git clone git@github.com:zhanghang1989/PyTorch-Encoding.git
- Install PyTorch Encoding (if not yet). Please follow the installation guide `Installing PyTorch Encoding <../notes/compile.html>`_.
- Download the `MINC-2500 <http://opensurfaces.cs.cornell.edu/publications/minc/>`_ dataset to the ``$HOME/data/minc-2500/`` folder. Download the pre-trained model (training `curve`_ as below, pre-trained on the train-1 split using a single training size of 224, with an error rate of :math:`19.98\%` using a single crop on the test-1 set)::
cd PyTorch-Encoding/experiments
cd PyTorch-Encoding/experiments/recognition
bash model/download_models.sh
.. _curve:
......@@ -41,14 +39,14 @@ Train Your Own Model
- Example training command for training above model::
python main.py --model deepten --nclass 23 --model encodingnet --batch-size 64 --lr 0.01 --epochs 60
python main.py --model deepten --nclass 23 --model deepten --batch-size 64 --lr 0.01 --epochs 60
- Training options::
- Detail training options::
-h, --help show this help message and exit
--dataset DATASET training dataset (default: cifar10)
--model MODEL network model type (default: densenet)
--widen N widen factor of the network (default: 4)
--backbone BACKBONE backbone name (default: resnet50)
--batch-size N batch size for training (default: 128)
--test-batch-size N batch size for testing (default: 1000)
--epochs N number of epochs to train (default: 300)
......@@ -69,7 +67,7 @@ Train Your Own Model
Extending the Software
----------------------
This code includes an integrated pipeline and some visualization tools (progress bar, real-time training curve plots). It is easy to use and extend for your own model or dataset:
This code is well written, easy to use and extendable for your own models or datasets:
- Write your own Dataloader ``mydataset.py`` to ``dataset/`` folder
......
......@@ -25,7 +25,18 @@ Reference
---------
.. note::
If using the code in your research, please cite our paper.
If using the code in your research, please cite our papers.
* Hang Zhang, Kristin Dana, Jianping Shi, Zhongyue Zhang, Xiaogang Wang, Ambrish Tyagi, Amit Agrawal. "Context Encoding for Semantic Segmentation" *The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2018*::
@InProceedings{Zhang_2018_CVPR,
author = {Zhang, Hang and Dana, Kristin and Shi, Jianping and Zhang, Zhongyue and Wang, Xiaogang and Tyagi, Ambrish and Agrawal, Amit},
title = {Context Encoding for Semantic Segmentation},
booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2018}
}
* Hang Zhang, Jia Xue, and Kristin Dana. "Deep TEN: Texture Encoding Network." *The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2017*::
......
Implementing Synchronized Multi-GPU Batch Normalization
=======================================================
In this tutorial, we discuss the implementation detail of Multi-GPU Batch Normalization (BN) :class:`encoding.nn.BatchNorm2d` and compatible :class:`encoding.parallel.SelfDataParallel`. We will provide the training example in a later version.
In this tutorial, we discuss the implementation detail of Multi-GPU Batch Normalization (BN) (classic implementation: :class:`encoding.nn.BatchNorm2d` and compatible :class:`encoding.parallel.SelfDataParallel`). We will provide the training example in a later version.
How BN works?
-------------
......@@ -23,7 +23,7 @@ BN layer was introduced in the paper `Batch Normalization: Accelerating Deep Net
\frac{d_\ell}{d_{x_i}} = \frac{d_\ell}{d_{y_i}}\cdot\frac{d_{y_i}}{d_{x_i}} + \frac{d_\ell}{d_\mu}\cdot\frac{d_\mu}{d_{x_i}} + \frac{d_\ell}{d_\sigma}\cdot\frac{d_\sigma}{d_{x_i}}
where :math:`\frac{d_\ell}{d_{x_i}}=\frac{\gamma}{\sigma}, \frac{d_\ell}{d_\mu}=-\frac{\gamma}{\sigma}\sum_i^N\frac{d_\ell}{d_{y_i}}
where :math:`\frac{d_{y_i}}{d_{x_i}}=\frac{\gamma}{\sigma}, \frac{d_\ell}{d_\mu}=-\frac{\gamma}{\sigma}\sum_i^N\frac{d_\ell}{d_{y_i}}
\text{ and } \frac{d_\sigma}{d_{x_i}}=-\frac{1}{\sigma}(\frac{x_i-\mu}{N})`.
Why Synchronize BN?
......@@ -49,6 +49,9 @@ Suppose we have :math:`K` number of GPUs, :math:`sum(x)_k` and :math:`sum(x^2)_k
* Then Sync the gradient (automatically handled by :class:`encoding.parallel.AllReduce`) and continue the backward.
Classic Implementation
~~~~~~~~~~~~~~~~~~~~~~
- Synchronized DataParallel:
Standard DataParallel pipeline of public frameworks (MXNet, PyTorch...) in each training iteration:
......
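The synchronization idea described in this tutorial (each of the :math:`K` devices contributes :math:`sum(x)_k` and :math:`sum(x^2)_k`, from which the global statistics are derived) can be illustrated with a minimal pure-Python sketch. This is not the library's implementation: the function names `partial_sums` and `global_mean_std` and the `eps` stabilizer are assumptions made for illustration, and plain lists stand in for the per-GPU mini-batches.

```python
import math


def partial_sums(chunk):
    """Per-device statistics: sum(x), sum(x^2) and the element count."""
    return sum(chunk), sum(v * v for v in chunk), len(chunk)


def global_mean_std(chunks, eps=1e-5):
    """Combine per-device partial sums into a global mean and std.

    `chunks` is a list of per-device value lists (hypothetical stand-ins
    for the slices of a mini-batch that live on each GPU). `eps` is an
    assumed numerical stabilizer.
    """
    total, total_sq, count = 0.0, 0.0, 0
    for chunk in chunks:
        s, sq, n = partial_sums(chunk)
        total += s
        total_sq += sq
        count += n
    mean = total / count
    # Var[x] = E[x^2] - (E[x])^2, computed from the reduced sums.
    var = total_sq / count - mean * mean
    return mean, math.sqrt(var + eps)


# Two "devices", each holding half of the batch.
mean, std = global_mean_std([[1.0, 2.0], [3.0, 4.0]], eps=0.0)
```

Only two scalars per channel need to cross devices, which is why reducing the sums is cheaper than gathering the full activations. Without the reduction, each device would normalize with its own statistics (for example a mean of 1.5 on the first chunk above instead of the global 2.5).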
......@@ -62,7 +62,6 @@ IF(MSVC)
ENDIF()
TARGET_LINK_LIBRARIES(ENCODING
${THC_LIBRARIES}
${TH_LIBRARIES}
${CUDA_cusparse_LIBRARY}
)
......
......@@ -36,4 +36,3 @@ SET(Torch_INSTALL_INCLUDE "${TORCH_BUILD_DIR}/include" ${TORCH_TH_INCLUDE_DIR} $
# Find the libs. We need to find libraries one by one.
SET(TH_LIBRARIES "$ENV{TH_LIBRARIES}")
SET(THC_LIBRARIES "$ENV{THC_LIBRARIES}")
......@@ -120,7 +120,9 @@ class _Transition(nn.Sequential):
class DenseNet(nn.Module):
r"""Dilated Densenet-BC model class
r"""Dilated DenseNet.
To dilate the transition layers of DenseNet correctly, we implement :class:`encoding.nn.DilatedAvgPool2d`.
Args:
growth_rate (int) - how many filters to add each layer (`k` in paper)
......
......@@ -62,9 +62,9 @@ def sum_square(input):
class _batchnorm(Function):
def __init__(self, training=False):
super(_batchnorm, self).__init__()
self.training = training
def __init__(ctx, training=False):
super(_batchnorm, ctx).__init__()
ctx.training = training
def forward(ctx, input, gamma, beta, mean, std):
ctx.save_for_backward(input, gamma, beta, mean, std)
......@@ -99,13 +99,13 @@ class _batchnorm(Function):
encoding_lib.Encoding_Float_batchnorm_Backward(
gradOutput, input, gradInput, gradGamma, gradBeta,
mean, invstd, gamma, beta, gradMean, gradStd,
self.training)
ctx.training)
elif isinstance(input, torch.cuda.DoubleTensor):
with torch.cuda.device_of(input):
encoding_lib.Encoding_Double_batchnorm_Backward(
gradOutput, input, gradInput, gradGamma, gradBeta,
mean, invstd, gamma, beta, gradMean, gradStd,
self.training)
ctx.training)
else:
raise RuntimeError('Unimplemented data type!')
return gradInput, gradGamma, gradBeta, gradMean, gradStd
......
......@@ -35,7 +35,7 @@ __global__ void Encoding_(DilatedAvgPool_Forward_kernel) (
c = bc - b*C;
/* boundary check for output */
if (w >= Y.getSize(3) || h >= Y.getSize(2)) return;
int hstart = h*dW -padH;
int hstart = h*dH -padH;
int wstart = w*dW -padW;
int hend = min(hstart + kH*dilationH, X.getSize(2));
int wend = min(wstart + kW*dilationW, X.getSize(3));
......
......@@ -13,13 +13,14 @@ import torch
from torch.nn import Module, Parameter
import torch.nn.functional as F
from torch.autograd import Function, Variable
from torch.nn.modules.utils import _single, _pair, _triple
from .._ext import encoding_lib
from ..functions import scaledL2, aggregate
from ..parallel import my_data_parallel
from ..functions import dilatedavgpool2d
__all__ = ['Encoding', 'EncodingShake', 'Inspiration', 'DilatedAvgPool2d', 'UpsampleConv2d']
__all__ = ['Encoding', 'EncodingDrop', 'Inspiration', 'DilatedAvgPool2d', 'UpsampleConv2d']
class Encoding(Module):
r"""
......@@ -104,9 +105,9 @@ class Encoding(Module):
+ 'N x ' + str(self.D) + '=>' + str(self.K) + 'x' \
+ str(self.D) + ')'
class EncodingShake(Module):
class EncodingDrop(Module):
def __init__(self, D, K):
super(EncodingShake, self).__init__()
super(EncodingDrop, self).__init__()
# init codewords and smoothing factor
self.D, self.K = D, K
self.codewords = Parameter(torch.Tensor(K, D),
......@@ -119,7 +120,7 @@ class EncodingShake(Module):
self.codewords.data.uniform_(-std1, std1)
self.scale.data.uniform_(-1, 0)
def shake(self):
def _drop(self):
if self.training:
self.scale.data.uniform_(-1, 0)
else:
......@@ -143,14 +144,12 @@ class EncodingShake(Module):
X = X.view(B,D,-1).transpose(1,2).contiguous()
else:
raise RuntimeError('Encoding Layer unknown input dims!')
# shake
self.shake()
self._drop()
# assignment weights
A = F.softmax(scaledL2(X, self.codewords, self.scale), dim=1)
# aggregate
E = aggregate(A, X, self.codewords)
# shake
self.shake()
self._drop()
return E
def __repr__(self):
......@@ -202,27 +201,27 @@ class DilatedAvgPool2d(Module):
r"""We provide Dilated Average Pooling for the dilation of Densenet as
in :class:`encoding.dilated.DenseNet`.
Reference::
Reference:
We provide this code for a forthcoming paper.
Applies a 2D average pooling over an input signal composed of several input planes.
In the simplest case, the output value of the layer with input size :math:`(N, C, H, W)`,
output :math:`(N, C, H_{out}, W_{out})` and :attr:`kernel_size` :math:`(kH, kW)`
output :math:`(B, C, H_{out}, W_{out})`, :attr:`kernel_size` :math:`(k_H,k_W)`, :attr:`stride` :math:`(s_H,s_W)` :attr:`dilation` :math:`(d_H,d_W)`
can be precisely described as:
.. math::
\begin{array}{ll}
out(b, c, h, w) = 1 / (kH * kW) *
\sum_{{m}=0}^{kH-1} \sum_{{n}=0}^{kW-1}
input(b, c, dH * h + m, dW * w + n)
out(b, c, h, w) = 1 / (k_H \cdot k_W) \cdot
\sum_{{m}=0}^{k_H-1} \sum_{{n}=0}^{k_W-1}
input(b, c, s_H \cdot h + d_H \cdot m, s_W \cdot w + d_W \cdot n)
\end{array}
| If :attr:`padding` is non-zero, then the input is implicitly zero-padded on both sides
for :attr:`padding` number of points
The parameters :attr:`kernel_size`, :attr:`stride`, :attr:`padding`, :attr:`dilation` can either be:
| The parameters :attr:`kernel_size`, :attr:`stride`, :attr:`padding`, :attr:`dilation` can either be:
- a single ``int`` -- in which case the same value is used for the height and width dimension
- a ``tuple`` of two ints -- in which case, the first `int` is used for the height dimension,
......@@ -235,10 +234,11 @@ class DilatedAvgPool2d(Module):
dilation: the dilation parameter similar to Conv2d
Shape:
- Input: :math:`(N, C, H_{in}, W_{in})`
- Output: :math:`(N, C, H_{out}, W_{out})` where
- Input: :math:`(B, C, H_{in}, W_{in})`
- Output: :math:`(B, C, H_{out}, W_{out})` where
:math:`H_{out} = floor((H_{in} + 2 * padding[0] - kernel\_size[0]) / stride[0] + 1)`
:math:`W_{out} = floor((W_{in} + 2 * padding[1] - kernel\_size[1]) / stride[1] + 1)`
For :attr:`stride=1`, the output featuremap preserves the same size as input.
Examples::
......@@ -306,7 +306,7 @@ class UpsampleConv2d(Module):
(in_channels, scale * scale * out_channels, kernel_size[0], kernel_size[1])
bias (Tensor): the learnable bias of the module of shape (scale * scale * out_channels)
Examples::
Examples:
>>> # With square kernels and equal stride
>>> m = nn.UpsampleConv2d(16, 33, 3, stride=2)
>>> # non-square kernels and unequal stride and with padding
......
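The dilated average pooling formula in the `DilatedAvgPool2d` docstring can be checked against a small pure-Python reference. This is a sketch under stated assumptions, not the CUDA kernel: the function name `dilated_avg_pool2d` is hypothetical, it handles a single H x W plane given as nested lists, it divides by `k_H * k_W` as in the docstring formula, it treats positions outside the input as zeros, and it uses the output size formula from the docstring.

```python
def dilated_avg_pool2d(x, kernel, stride=(1, 1), padding=(0, 0), dilation=(1, 1)):
    """Reference dilated average pooling over one H x W plane (list of rows)."""
    k_h, k_w = kernel
    s_h, s_w = stride
    p_h, p_w = padding
    d_h, d_w = dilation
    in_h, in_w = len(x), len(x[0])
    # Output size as stated in the docstring (dilation is not part of it).
    out_h = (in_h + 2 * p_h - k_h) // s_h + 1
    out_w = (in_w + 2 * p_w - k_w) // s_w + 1
    out = []
    for h in range(out_h):
        row = []
        for w in range(out_w):
            total = 0.0
            for m in range(k_h):
                for n in range(k_w):
                    # The row index uses the height stride and the column
                    # index uses the width stride.
                    i = s_h * h - p_h + d_h * m
                    j = s_w * w - p_w + d_w * n
                    if 0 <= i < in_h and 0 <= j < in_w:
                        total += x[i][j]
            row.append(total / (k_h * k_w))
        out.append(row)
    return out


plane = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15]]
dilated = dilated_avg_pool2d(plane, kernel=(2, 2), dilation=(2, 2))
strided = dilated_avg_pool2d(plane, kernel=(2, 2), stride=(2, 1))
```

The `strided` call uses different strides for height and width, which is the case where mixing up the two stride values (as in the `hstart` line corrected by this commit) would produce wrong row offsets.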
......@@ -19,7 +19,7 @@ from torch.nn.parallel.scatter_gather import scatter, scatter_kwargs, \
from torch.nn.parallel.replicate import replicate
from torch.nn.parallel.parallel_apply import parallel_apply
__all__ = ['AllReduce', 'Broadcast', 'ModelDataParallel',
__all__ = ['Reduce', 'AllReduce', 'Broadcast', 'ModelDataParallel',
'CriterionDataParallel', 'SelfDataParallel']
def nccl_all_reduce(inputs):
......@@ -45,6 +45,22 @@ def comm_all_reduce(inputs):
results.append(result.clone().cuda(i))
return results
class Reduce(Function):
def forward(ctx, *inputs):
ctx.save_for_backward(*inputs)
if len(inputs) == 1:
return inputs[0]
return comm.reduce_add(inputs)
def backward(ctx, gradOutput):
inputs = tuple(ctx.saved_tensors)
if len(inputs) == 1:
return gradOutput
gradInputs = []
for i in range(len(inputs)):
with torch.cuda.device_of(inputs[i]):
gradInputs.append(gradOutput.cuda())
return tuple(gradInputs)
class AllReduce(Function):
"""Cross GPU all reduce autograd operation for calculate mean and
......
- [Link to the Deep TEN pre-trained models and experiments](http://hangzh.com/PyTorch-Encoding/experiments/texture.html)
- [Link to the EncNet CIFAR experiments and pre-trained models](http://hangzh.com/PyTorch-Encoding/experiments/cifar.html)
- [Link to the Deep TEN experiments and pre-trained models](http://hangzh.com/PyTorch-Encoding/experiments/texture.html)
......@@ -24,6 +24,8 @@ class Options():
help='number of classes (default: 10)')
parser.add_argument('--widen', type=int, default=4, metavar='N',
help='widen factor of the network (default: 4)')
parser.add_argument('--ncodes', type=int, default=32, metavar='N',
help='number of codewords in Encoding Layer (default: 32)')
parser.add_argument('--backbone', type=str, default='resnet50',
help='backbone name (default: resnet50)')
# training hyper params
......@@ -31,8 +33,8 @@ class Options():
metavar='N', help='batch size for training (default: 128)')
parser.add_argument('--test-batch-size', type=int, default=256,
metavar='N', help='batch size for testing (default: 256)')
parser.add_argument('--epochs', type=int, default=300, metavar='N',
help='number of epochs to train (default: 300)')
parser.add_argument('--epochs', type=int, default=600, metavar='N',
help='number of epochs to train (default: 600)')
parser.add_argument('--start_epoch', type=int, default=1,
metavar='N', help='the epoch number to start (default: 1)')
# lr setting
......@@ -65,4 +67,7 @@ class Options():
self.parser = parser
def parse(self):
return self.parser.parse_args()
args = self.parser.parse_args()
if args.dataset == 'minc':
args.nclass = 23
return args
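The change to `parse()` above overrides `nclass` after parsing when the dataset is `minc`. A minimal standalone sketch of that pattern with `argparse` follows; the helper names `build_parser` and `parse` and the reduced option set are assumptions for illustration and do not reproduce the full `Options` class.

```python
import argparse


def build_parser():
    """Hypothetical two-option parser mirroring the defaults shown in the diff."""
    parser = argparse.ArgumentParser(description='options sketch')
    parser.add_argument('--dataset', type=str, default='cifar10',
                        help='training dataset (default: cifar10)')
    parser.add_argument('--nclass', type=int, default=10, metavar='N',
                        help='number of classes (default: 10)')
    return parser


def parse(argv=None):
    """Parse arguments, then derive dataset-dependent settings."""
    args = build_parser().parse_args(argv)
    if args.dataset == 'minc':
        # MINC-2500 has 23 classes, so the value is fixed here and any
        # --nclass given on the command line is overwritten.
        args.nclass = 23
    return args


minc_args = parse(['--dataset', 'minc'])
default_args = parse([])
```

One consequence of this design is that an explicit `--nclass` is silently ignored for `minc`, which keeps the command line short but can surprise users who pass both flags.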
......@@ -32,7 +32,7 @@ class install(setuptools.command.install.install):
with open(version_path, 'w') as f:
f.write("__version__ = '{}'\n".format(version))
version = '0.1.0'
version = '0.2.0'
try:
sha = subprocess.check_output(['git', 'rev-parse', 'HEAD'],
cwd=cwd).decode('ascii').strip()
......
......@@ -62,19 +62,6 @@ def test_sum_square():
print('Testing sum_square(): {}'.format(test))
def test_dilated_densenet():
net = encoding.dilated.densenet161(True).cuda().eval()
print(net)
net2 = models.densenet161(True).cuda().eval()
x=Variable(torch.Tensor(1,3,224,224).uniform_(-0.5,0.5)).cuda()
y = net.features(x)
y2 = net2.features(x)
print(y[0][0])
print(y2[0][0])
def test_dilated_avgpool():
X = Variable(torch.cuda.FloatTensor(1,3,75,75).uniform_(-0.5,0.5))
input = (X,)
......@@ -89,6 +76,3 @@ if __name__ == '__main__':
test_aggregate()
test_sum_square()
test_dilated_avgpool()
"""
test_dilated_densenet()
"""