Unverified Commit b872eb8c authored by Hang Zhang, committed by GitHub

ResNeSt plus (#256)

parent 5a1e3fbc
......@@ -4,9 +4,8 @@
name: Upload Python Package
on:
push:
branches:
- master
schedule:
- cron: "0 12 * * *"
jobs:
deploy:
......
......@@ -47,5 +47,5 @@ jobs:
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
with:
build_dir: "docs/build/html/*"
build_dir: docs/build/html/
target_branch: gh-pages
......@@ -9,16 +9,21 @@ on:
jobs:
docs:
runs-on: ubuntu-latest
runs-on: self-hosted
steps:
- uses: actions/checkout@v2
- uses: seanmiddleditch/gha-setup-ninja@master
- name: Set up Python
uses: actions/setup-python@v1
- name: Set PR Number
uses: actions/github-script@0.3.0
with:
python-version: 3.7
github-token: ${{github.token}}
script: |
const core = require('@actions/core')
const prNumber = context.payload.number;
core.exportVariable('PULL_NUMBER', prNumber);
core.exportVariable("PATH", "/home/ubuntu/anaconda3/bin:/usr/local/bin:/usr/bin/:/bin:$PATH")
- name: Install dependencies
run: |
python -m pip install --upgrade pip
......@@ -39,35 +44,11 @@ jobs:
cd docs/
make html
touch build/html/.nojekyll
- name: Set PR Number
uses: actions/github-script@0.3.0
with:
github-token: ${{github.token}}
script: |
const core = require('@actions/core')
const prNumber = context.payload.number;
core.exportVariable('PULL_NUMBER', prNumber);
# https://github.com/marketplace/actions/github-pages
- name: Deploy
if: success()
uses: jakejarvis/s3-sync-action@master
with:
args: --acl public-read --follow-symlinks --delete
env:
AWS_S3_BUCKET: ${{ secrets.AWS_S3_BUCKET }}
AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
AWS_REGION: ${{ secrets.AWS_REGION }}
DEST_DIR: "${{ secrets.DEST_DIR }}/${PULL_NUMBER}"
SOURCE_DIR: 'docs/build/html/'
aws s3 sync build/html/ s3://hangzh/encoding/docs/${{ env.PULL_NUMBER }}/ --acl public-read --follow-symlinks --delete
- name: Comment
if: success()
uses: thollander/actions-comment-pull-request@master
with:
message: "The docs are uploaded and can be previewed at http://${{ secrets.AWS_S3_BUCKET }}.s3.amazonaws.com/${{ secrets.DEST_DIR }}/${{ env.PULL_NUMBER }}/index.html"
message: "The docs are uploaded and can be previewed at http://hangzh.s3.amazonaws.com/encoding/docs/${{ env.PULL_NUMBER }}/index.html"
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
......@@ -8,3 +8,6 @@ docs/src/
docs/html/
encoding/_ext/
encoding.egg-info/
*.o
*.so
*.ninja*
[![PyPI](https://img.shields.io/pypi/v/torch-encoding.svg)](https://pypi.python.org/pypi/torch-encoding)
[![PyPI Pre-release](https://img.shields.io/badge/pypi--prerelease-v1.1.0-ff69b4.svg)](https://pypi.org/project/torch-encoding/#history)
[![PyPI Pre-release](https://img.shields.io/badge/pypi--prerelease-v1.2.0-ff69b4.svg)](https://pypi.org/project/torch-encoding/#history)
[![Upload Python Package](https://github.com/zhanghang1989/PyTorch-Encoding/workflows/Upload%20Python%20Package/badge.svg)](https://github.com/zhanghang1989/PyTorch-Encoding/actions)
[![Downloads](http://pepy.tech/badge/torch-encoding)](http://pepy.tech/project/torch-encoding)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
[![Build Docs](https://github.com/zhanghang1989/PyTorch-Encoding/workflows/Build%20Docs/badge.svg)](https://github.com/zhanghang1989/PyTorch-Encoding/actions)
# PyTorch-Encoding
......@@ -11,10 +12,23 @@ created by [Hang Zhang](http://hangzh.com/)
- Please visit the [**Docs**](http://hangzh.com/PyTorch-Encoding/) for detailed instructions on installation and usage.
- Please visit the [link](http://hangzh.com/PyTorch-Encoding/experiments/segmentation.html) for examples of semantic segmentation.
- Please visit the [link](http://hangzh.com/PyTorch-Encoding/model_zoo/imagenet.html) for image classification models.
- Please visit the [link](http://hangzh.com/PyTorch-Encoding/model_zoo/segmentation.html) for semantic segmentation models.
## Citations
**ResNeSt: Split-Attention Networks** [[arXiv]]()
[Hang Zhang](http://hangzh.com/), Chongruo Wu, Zhongyue Zhang, Yi Zhu, Zhi Zhang, Haibin Lin, Yue Sun, Tong He, Jonas Muller, R. Manmatha, Mu Li and Alex Smola
```
@article{zhang2020resnest,
title={ResNeSt: Split-Attention Networks},
author={Zhang, Hang and Wu, Chongruo and Zhang, Zhongyue and Zhu, Yi and Zhang, Zhi and Lin, Haibin and Sun, Yue and He, Tong and Muller, Jonas and Manmatha, R. and Li, Mu and Smola, Alexander},
journal={arXiv preprint},
year={2020}
}
```
**Context Encoding for Semantic Segmentation** [[arXiv]](https://arxiv.org/pdf/1803.08904.pdf)
[Hang Zhang](http://hangzh.com/), [Kristin Dana](http://eceweb1.rutgers.edu/vision/dana.html), [Jianping Shi](http://shijianping.me/), [Zhongyue Zhang](http://zhongyuezhang.com/), [Xiaogang Wang](http://www.ee.cuhk.edu.hk/~xgwang/), [Ambrish Tyagi](https://scholar.google.com/citations?user=GaSWCoUAAAAJ&hl=en), [Amit Agrawal](http://www.amitkagrawal.com/)
```
......
......@@ -13,16 +13,23 @@ An optimized PyTorch package with CUDA backend.
.. toctree::
:glob:
:maxdepth: 1
:caption: Notes
:caption: Installation
notes/*
.. toctree::
:glob:
:maxdepth: 1
:caption: Experiment Systems
:caption: Model Zoo
experiments/*
model_zoo/*
.. toctree::
:glob:
:maxdepth: 1
:caption: Other Tutorials
tutorials/*
.. toctree::
:maxdepth: 1
......@@ -30,7 +37,6 @@ An optimized PyTorch package with CUDA backend.
nn
parallel
models
utils
Indices and tables
......
Image Classification
====================
Install Package
---------------
- Clone the GitHub repo::
git clone https://github.com/zhanghang1989/PyTorch-Encoding
- Install PyTorch Encoding (if not yet installed). Please follow the installation guide `Installing PyTorch Encoding <../notes/compile.html>`_.
Get Pre-trained Model
---------------------
.. hint::
How to get a pretrained model, for example ``ResNeSt50``::
model = encoding.models.get_model('ResNeSt50', pretrained=True)
After clicking ``cmd`` in the table, the command for training the model can be found below the table.
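For instance, a minimal inference sketch (assuming ``torch-encoding`` is installed; the dummy input below stands in for a properly resized and ImageNet-normalized image batch)::

    import torch
    import encoding

    # load the pretrained ResNeSt-50 (weights are downloaded on first use)
    model = encoding.models.get_model('ResNeSt50', pretrained=True)
    model.eval()

    # dummy 224x224 batch; real images should be cropped to the listed
    # crop-size and normalized with the usual ImageNet mean/std
    x = torch.randn(1, 3, 224, 224)
    with torch.no_grad():
        logits = model(x)
    print(logits.argmax(dim=1))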
.. role:: raw-html(raw)
:format: html
ResNeSt
~~~~~~~
.. note::
The provided models were trained using MXNet Gluon; this PyTorch implementation is slightly worse than the original implementation.
=============================== ============== ============== =========================================================================================================
Model crop-size Acc Command
=============================== ============== ============== =========================================================================================================
ResNeSt-50 224 81.03 :raw-html:`<a href="javascript:toggleblock('cmd_resnest50')" class="toggleblock">cmd</a>`
ResNeSt-101 256 82.83 :raw-html:`<a href="javascript:toggleblock('cmd_resnest101')" class="toggleblock">cmd</a>`
ResNeSt-200 320 83.84 :raw-html:`<a href="javascript:toggleblock('cmd_resnest200')" class="toggleblock">cmd</a>`
ResNeSt-269 416 84.54 :raw-html:`<a href="javascript:toggleblock('cmd_resnest269')" class="toggleblock">cmd</a>`
=============================== ============== ============== =========================================================================================================
.. raw:: html
<code xml:space="preserve" id="cmd_resnest50" style="display: none; text-align: left; white-space: pre-wrap">
# change the rank for worker node
python train_dist.py --dataset imagenet --model resnest50 --lr-scheduler cos --epochs 270 --checkname resnest50 --lr 0.025 --batch-size 64 --dist-url tcp://MASTER:NODE:IP:ADDRESS:23456 --world-size 4 --label-smoothing 0.1 --mixup 0.2 --no-bn-wd --last-gamma --warmup-epochs 5 --rand-aug --rank 0
</code>
<code xml:space="preserve" id="cmd_resnest101" style="display: none; text-align: left; white-space: pre-wrap">
# change the rank for worker node
python train_dist.py --dataset imagenet --model resnest101 --lr-scheduler cos --epochs 270 --checkname resnest101 --lr 0.025 --batch-size 64 --dist-url tcp://MASTER:NODE:IP:ADDRESS:23456 --world-size 4 --label-smoothing 0.1 --mixup 0.2 --no-bn-wd --last-gamma --warmup-epochs 5 --rand-aug --rank 0
</code>
<code xml:space="preserve" id="cmd_resnest200" style="display: none; text-align: left; white-space: pre-wrap">
# change the rank for worker node
python train_dist.py --dataset imagenet --model resnest200 --lr-scheduler cos --epochs 270 --checkname resnest200 --lr 0.0125 --batch-size 32 --dist-url tcp://MASTER:NODE:IP:ADDRESS:23456 --world-size 8 --label-smoothing 0.1 --mixup 0.2 --no-bn-wd --last-gamma --warmup-epochs 5 --rand-aug --crop-size 256 --rank 0
</code>
<code xml:space="preserve" id="cmd_resnest269" style="display: none; text-align: left; white-space: pre-wrap">
# change the rank for worker node
python train_dist.py --dataset imagenet --model resnest269 --lr-scheduler cos --epochs 270 --checkname resnest269 --lr 0.0125 --batch-size 32 --dist-url tcp://MASTER:NODE:IP:ADDRESS:23456 --world-size 8 --label-smoothing 0.1 --mixup 0.2 --no-bn-wd --last-gamma --warmup-epochs 5 --rand-aug --crop-size 320 --rank 0
</code>
Test Pretrained
~~~~~~~~~~~~~~~
- Prepare the datasets by downloading the data into the current folder and then running the scripts in the ``scripts/`` folder::
python scripts/prepare_imagenet.py --data-dir ./
- The test script is in the ``experiments/recognition/`` folder. For evaluating the model (using MS),
for example ``ResNeSt50``::
python test.py --dataset imagenet --model-zoo ResNeSt50 --crop-size 224 --eval
Train Your Own Model
--------------------
- Prepare the datasets by downloading the data into the current folder and then running the scripts in the ``scripts/`` folder::
python scripts/prepare_imagenet.py --data-dir ./
- The training script is in the ``experiments/recognition/`` folder. Commands for reproducing pre-trained models can be found in the table.
Context Encoding for Semantic Segmentation (EncNet)
===================================================
Semantic Segmentation
=====================
Install Package
---------------
......@@ -29,31 +29,52 @@ Get Pre-trained Model
:format: html
.. tabularcolumns:: |>{\centering\arraybackslash}\X{4}{5}|>{\raggedleft\arraybackslash}\X{1}{5}|
ResNeSt Backbone Models
-----------------------
============================================================================== ============== ============== =============================================================================================
============================================================================== ============== ============== =========================================================================================================
Model pixAcc mIoU Command
============================================================================== ============== ============== =============================================================================================
Encnet_ResNet50_PContext 79.2% 51.0% :raw-html:`<a href="javascript:toggleblock('cmd_enc50_pcont')" class="toggleblock">cmd</a>`
EncNet_ResNet101_PContext 80.7% 54.1% :raw-html:`<a href="javascript:toggleblock('cmd_enc101_pcont')" class="toggleblock">cmd</a>`
EncNet_ResNet50_ADE 80.1% 41.5% :raw-html:`<a href="javascript:toggleblock('cmd_enc50_ade')" class="toggleblock">cmd</a>`
EncNet_ResNet101_ADE 81.3% 44.4% :raw-html:`<a href="javascript:toggleblock('cmd_enc101_ade')" class="toggleblock">cmd</a>`
EncNet_ResNet101_VOC N/A 85.9% :raw-html:`<a href="javascript:toggleblock('cmd_enc101_voc')" class="toggleblock">cmd</a>`
============================================================================== ============== ============== =============================================================================================
============================================================================== ============== ============== =========================================================================================================
FCN_ResNeSt50_ADE xx.xx% xx.xx% :raw-html:`<a href="javascript:toggleblock('cmd_fcn_nest50_ade')" class="toggleblock">cmd</a>`
DeepLabV3_ResNeSt50_ADE 81.17% 45.12% :raw-html:`<a href="javascript:toggleblock('cmd_deeplab_resnest50_ade')" class="toggleblock">cmd</a>`
DeepLabV3_ResNeSt101_ADE 82.07% 46.91% :raw-html:`<a href="javascript:toggleblock('cmd_deeplab_resnest101_ade')" class="toggleblock">cmd</a>`
============================================================================== ============== ============== =========================================================================================================
.. raw:: html
<code xml:space="preserve" id="cmd_fcn50_pcont" style="display: none; text-align: left; white-space: pre-wrap">
CUDA_VISIBLE_DEVICES=0,1,2,3 python train.py --dataset PContext --model FCN
<code xml:space="preserve" id="cmd_fcn_nest50_ade" style="display: none; text-align: left; white-space: pre-wrap">
python train.py --dataset ade20k --model fcn --aux --backbone resnest50 --batch-size 2
</code>
<code xml:space="preserve" id="cmd_enc50_pcont" style="display: none; text-align: left; white-space: pre-wrap">
CUDA_VISIBLE_DEVICES=0,1,2,3 python train.py --dataset PContext --model EncNet --aux --se-loss
<code xml:space="preserve" id="cmd_deeplab_resnest50_ade" style="display: none; text-align: left; white-space: pre-wrap">
python train.py --dataset ADE20K --model deeplab --aux --backbone resnest50
</code>
<code xml:space="preserve" id="cmd_enc101_pcont" style="display: none; text-align: left; white-space: pre-wrap">
CUDA_VISIBLE_DEVICES=0,1,2,3 python train.py --dataset PContext --model EncNet --aux --se-loss --backbone resnet101
<code xml:space="preserve" id="cmd_deeplab_resnest101_ade" style="display: none; text-align: left; white-space: pre-wrap">
python train.py --dataset ADE20K --model deeplab --aux --backbone resnest101
</code>
ResNet Backbone Models
----------------------
ADE20K Dataset
~~~~~~~~~~~~~~
============================================================================== ================= ============== =============================================================================================
Model pixAcc mIoU Command
============================================================================== ================= ============== =============================================================================================
FCN_ResNet50_ADE 78.7% 38.5% :raw-html:`<a href="javascript:toggleblock('cmd_fcn50_ade')" class="toggleblock">cmd</a>`
EncNet_ResNet50_ADE 80.1% 41.5% :raw-html:`<a href="javascript:toggleblock('cmd_enc50_ade')" class="toggleblock">cmd</a>`
EncNet_ResNet101_ADE 81.3% 44.4% :raw-html:`<a href="javascript:toggleblock('cmd_enc101_ade')" class="toggleblock">cmd</a>`
EncNet_ResNet101_VOC N/A 85.9% :raw-html:`<a href="javascript:toggleblock('cmd_enc101_voc')" class="toggleblock">cmd</a>`
============================================================================== ================= ============== =============================================================================================
.. raw:: html
<code xml:space="preserve" id="cmd_fcn50_ade" style="display: none; text-align: left; white-space: pre-wrap">
CUDA_VISIBLE_DEVICES=0,1,2,3 python train.py --dataset ADE20K --model FCN
</code>
<code xml:space="preserve" id="cmd_psp50_ade" style="display: none; text-align: left; white-space: pre-wrap">
......@@ -64,7 +85,6 @@ EncNet_ResNet101_VOC
CUDA_VISIBLE_DEVICES=0,1,2,3 python train.py --dataset ADE20K --model EncNet --aux --se-loss
</code>
<code xml:space="preserve" id="cmd_enc101_ade" style="display: none; text-align: left; white-space: pre-wrap">
CUDA_VISIBLE_DEVICES=0,1,2,3 python train.py --dataset ADE20K --model EncNet --aux --se-loss --backbone resnet101 --base-size 640 --crop-size 576
</code>
......@@ -77,6 +97,33 @@ EncNet_ResNet101_VOC
CUDA_VISIBLE_DEVICES=0,1,2,3 python train.py --dataset Pascal_voc --model encnet --aux --se-loss --backbone resnet101 --lr 0.0001 --syncbn --ngpus 4 --checkname res101 --resume runs/Pascal_aug/encnet/res101/checkpoint.params --ft
</code>
Pascal Context Dataset
~~~~~~~~~~~~~~~~~~~~~~
============================================================================== ================= ============== =============================================================================================
Model pixAcc mIoU Command
============================================================================== ================= ============== =============================================================================================
Encnet_ResNet50_PContext 79.2% 51.0% :raw-html:`<a href="javascript:toggleblock('cmd_enc50_pcont')" class="toggleblock">cmd</a>`
EncNet_ResNet101_PContext 80.7% 54.1% :raw-html:`<a href="javascript:toggleblock('cmd_enc101_pcont')" class="toggleblock">cmd</a>`
============================================================================== ================= ============== =============================================================================================
.. raw:: html
<code xml:space="preserve" id="cmd_fcn50_pcont" style="display: none; text-align: left; white-space: pre-wrap">
CUDA_VISIBLE_DEVICES=0,1,2,3 python train.py --dataset PContext --model FCN
</code>
<code xml:space="preserve" id="cmd_enc50_pcont" style="display: none; text-align: left; white-space: pre-wrap">
CUDA_VISIBLE_DEVICES=0,1,2,3 python train.py --dataset PContext --model EncNet --aux --se-loss
</code>
<code xml:space="preserve" id="cmd_enc101_pcont" style="display: none; text-align: left; white-space: pre-wrap">
CUDA_VISIBLE_DEVICES=0,1,2,3 python train.py --dataset PContext --model EncNet --aux --se-loss --backbone resnet101
</code>
Test Pretrained
~~~~~~~~~~~~~~~
......@@ -127,13 +174,13 @@ Quick Demo
Train Your Own Model
--------------------
- Prepare the datasets by running the scripts in the ``scripts/`` folder, for example preparing the ``PASCAL Context`` dataset::
- Prepare the datasets by running the scripts in the ``scripts/`` folder, for example preparing the ``ADE20K`` dataset::
python scripts/prepare_pcontext.py
python scripts/prepare_ade20k.py
- The training script is in the ``experiments/segmentation/`` folder, example training command::
CUDA_VISIBLE_DEVICES=0,1,2,3 python train.py --dataset pcontext --model encnet --aux --se-loss
python train_dist.py --dataset ade20k --model encnet --aux --se-loss
- For detailed training options, please run ``python train.py -h``. Commands for reproducing pre-trained models can be found in the table.
......@@ -142,7 +189,7 @@ Train Your Own Model
training correctness purpose. For evaluating the pretrained model on the validation set using MS,
please use the command::
CUDA_VISIBLE_DEVICES=0,1,2,3 python test.py --dataset pcontext --model encnet --aux --se-loss --resume mycheckpoint --eval
python test.py --dataset pcontext --model encnet --aux --se-loss --resume mycheckpoint --eval
Citation
--------
......
.. role:: hidden
:class: hidden-section
encoding.models
================
.. automodule:: encoding.models.resnet
.. currentmodule:: encoding.models.resnet
ResNet
------
We provide correctly dilated pre-trained ResNet and DenseNet models (output stride of 8) for semantic segmentation.
For dilation of DenseNet, we provide :class:`encoding.nn.DilatedAvgPool2d`.
All provided models have been verified.
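A minimal sketch of loading such a backbone (the ``dilated`` keyword is an assumption based on this package's ResNet variants, so treat it as illustrative rather than canonical)::

    import encoding

    # dilated ResNet-50 backbone (output stride 8) for segmentation;
    # 'dilated=True' is assumed to be forwarded to the ResNet constructor
    net = encoding.models.resnet50(pretrained=True, dilated=True)
    net.eval()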
.. note::
This code is provided together with the paper
* Hang Zhang, Kristin Dana, Jianping Shi, Zhongyue Zhang, Xiaogang Wang, Ambrish Tyagi, Amit Agrawal. "Context Encoding for Semantic Segmentation" *The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2018*
:hidden:`ResNet`
~~~~~~~~~~~~~~~~
.. autoclass:: ResNet
:members:
:hidden:`resnet18`
~~~~~~~~~~~~~~~~~~
.. autofunction:: resnet18
:hidden:`resnet34`
~~~~~~~~~~~~~~~~~~
.. autofunction:: resnet34
:hidden:`resnet50`
~~~~~~~~~~~~~~~~~~
.. autofunction:: resnet50
:hidden:`resnet101`
~~~~~~~~~~~~~~~~~~~
.. autofunction:: resnet101
:hidden:`resnet152`
~~~~~~~~~~~~~~~~~~~
.. autofunction:: resnet152
......@@ -14,6 +14,12 @@ Customized NN modules in Encoding Package. For Synchronized Cross-GPU Batch Norm
.. autoclass:: Encoding
:members:
:hidden:`DistSyncBatchNorm`
~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. autoclass:: DistSyncBatchNorm
:members:
:hidden:`SyncBatchNorm`
~~~~~~~~~~~~~~~~~~~~~~~~
......
......@@ -5,17 +5,41 @@ Install and Citations
Installation
------------
* Install PyTorch 1.0 by following the `PyTorch instructions <http://pytorch.org/>`_.
* Install PyTorch 1.4.0 by following the `PyTorch instructions <http://pytorch.org/>`_.
* PIP Install::
pip install torch-encoding
pip install torch-encoding --pre
* Install from source::
git clone https://github.com/zhanghang1989/PyTorch-Encoding && cd PyTorch-Encoding
python setup.py install
Detailed Steps
--------------
This tutorial is a successful setup example for an AWS EC2 p3 instance with Ubuntu 16.04 and CUDA 10.
We cannot guarantee it will work for all machines, but the steps should be similar.
We assume CUDA and cuDNN are already installed; otherwise, please refer to other tutorials.
* Install Anaconda from the `link <https://www.anaconda.com/distribution/>`_ .
* Install ninja::
wget https://github.com/ninja-build/ninja/releases/download/v1.8.2/ninja-linux.zip
sudo unzip ninja-linux.zip -d /usr/local/bin/
sudo update-alternatives --install /usr/bin/ninja ninja /usr/local/bin/ninja 1 --force
* Install PyTorch::
conda install pytorch torchvision cudatoolkit=10.0 -c pytorch
* Install this package::
pip install torch-encoding --pre
Citations
---------
......
......@@ -7,10 +7,7 @@ encoding.parallel
- The current PyTorch DataParallel does not support multi-GPU loss calculation, which makes GPU memory usage very unbalanced. We address this issue here by applying DataParallel to both the model and the criterion.
.. note::
This code is provided together with the paper
* Hang Zhang, Kristin Dana, Jianping Shi, Zhongyue Zhang, Xiaogang Wang, Ambrish Tyagi, Amit Agrawal. "Context Encoding for Semantic Segmentation" *The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2018*
Deprecated: please use ``torch.nn.parallel.DistributedDataParallel`` with :class:`encoding.nn.DistSyncBatchNorm` for the best performance.
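A minimal sketch of the recommended replacement, assuming the script is started with the standard ``torch.distributed`` launcher and that :class:`encoding.nn.DistSyncBatchNorm` takes the same ``num_features`` argument as ``torch.nn.BatchNorm2d``::

    import torch
    import torch.nn as nn
    import encoding

    # the launcher sets the environment variables consumed here
    torch.distributed.init_process_group(backend='nccl')
    local_rank = torch.distributed.get_rank() % torch.cuda.device_count()
    torch.cuda.set_device(local_rank)

    # toy network: DistSyncBatchNorm stands in for torch.nn.BatchNorm2d
    model = nn.Sequential(
        nn.Conv2d(3, 64, kernel_size=3, padding=1),
        encoding.nn.DistSyncBatchNorm(64),
        nn.ReLU(inplace=True),
    ).cuda()

    model = torch.nn.parallel.DistributedDataParallel(
        model, device_ids=[local_rank])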
.. automodule:: encoding.parallel
.. currentmodule:: encoding.parallel
......
......@@ -20,6 +20,12 @@ Useful util functions.
.. autofunction:: save_checkpoint
:hidden:`SegmentationMetric`
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. autoclass:: SegmentationMetric
:members:
:hidden:`batch_pix_accuracy`
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
......
......@@ -57,38 +57,43 @@ class ADE20KSegmentation(BaseDataset):
mask = self.target_transform(mask)
return img, mask
def _sync_transform(self, img, mask):
# random mirror
if random.random() < 0.5:
img = img.transpose(Image.FLIP_LEFT_RIGHT)
mask = mask.transpose(Image.FLIP_LEFT_RIGHT)
crop_size = self.crop_size
w, h = img.size
long_size = random.randint(int(self.base_size*0.5), int(self.base_size*2.5))
if h > w:
oh = long_size
ow = int(1.0 * w * long_size / h + 0.5)
short_size = ow
else:
ow = long_size
oh = int(1.0 * h * long_size / w + 0.5)
short_size = oh
img = img.resize((ow, oh), Image.BILINEAR)
mask = mask.resize((ow, oh), Image.NEAREST)
# pad crop
if short_size < crop_size:
padh = crop_size - oh if oh < crop_size else 0
padw = crop_size - ow if ow < crop_size else 0
img = ImageOps.expand(img, border=(0, 0, padw, padh), fill=0)
mask = ImageOps.expand(mask, border=(0, 0, padw, padh), fill=0)
# random crop crop_size
w, h = img.size
x1 = random.randint(0, w - crop_size)
y1 = random.randint(0, h - crop_size)
img = img.crop((x1, y1, x1+crop_size, y1+crop_size))
mask = mask.crop((x1, y1, x1+crop_size, y1+crop_size))
# final transform
return img, self._mask_transform(mask)
#def _sync_transform(self, img, mask):
# # random mirror
# if random.random() < 0.5:
# img = img.transpose(Image.FLIP_LEFT_RIGHT)
# mask = mask.transpose(Image.FLIP_LEFT_RIGHT)
# crop_size = self.crop_size
# # random scale (short edge)
# w, h = img.size
# long_size = random.randint(int(self.base_size*0.5), int(self.base_size*2.0))
# if h > w:
# oh = long_size
# ow = int(1.0 * w * long_size / h + 0.5)
# short_size = ow
# else:
# ow = long_size
# oh = int(1.0 * h * long_size / w + 0.5)
# short_size = oh
# img = img.resize((ow, oh), Image.BILINEAR)
# mask = mask.resize((ow, oh), Image.NEAREST)
# # pad crop
# if short_size < crop_size:
# padh = crop_size - oh if oh < crop_size else 0
# padw = crop_size - ow if ow < crop_size else 0
# img = ImageOps.expand(img, border=(0, 0, padw, padh), fill=0)
# mask = ImageOps.expand(mask, border=(0, 0, padw, padh), fill=0)
# # random crop crop_size
# w, h = img.size
# x1 = random.randint(0, w - crop_size)
# y1 = random.randint(0, h - crop_size)
# img = img.crop((x1, y1, x1+crop_size, y1+crop_size))
# mask = mask.crop((x1, y1, x1+crop_size, y1+crop_size))
# # gaussian blur as in PSP
# if random.random() < 0.5:
# img = img.filter(ImageFilter.GaussianBlur(
# radius=random.random()))
# # final transform
# return img, self._mask_transform(mask)
def _mask_transform(self, mask):
target = np.array(mask).astype('int64') - 1
......
......@@ -67,6 +67,7 @@ class BaseDataset(data.Dataset):
img = img.transpose(Image.FLIP_LEFT_RIGHT)
mask = mask.transpose(Image.FLIP_LEFT_RIGHT)
crop_size = self.crop_size
# random scale (short edge)
w, h = img.size
long_size = random.randint(int(self.base_size*0.5), int(self.base_size*2.0))
if h > w:
......
......@@ -19,7 +19,7 @@ from .base import BaseDataset
class CitySegmentation(BaseDataset):
NUM_CLASS = 19
def __init__(self, root=os.path.expanduser('~/.encoding/data'), split='train',
def __init__(self, root=os.path.expanduser('~/.encoding/data/citys/'), split='train',
mode=None, transform=None, target_transform=None, **kwargs):
super(CitySegmentation, self).__init__(
root, split, mode, transform, target_transform, **kwargs)
......