Unverified Commit 11bd2eaa authored by Vasilis Vryniotis, committed by GitHub

Port Multi-weight support from prototype to main (#5618)



* Moving basefiles outside of prototype and porting Alexnet, ConvNext, Densenet and EfficientNet.

* Porting googlenet

* Porting inception

* Porting mnasnet

* Porting mobilenetv2

* Porting mobilenetv3

* Porting regnet

* Porting resnet

* Porting shufflenetv2

* Porting squeezenet

* Porting vgg

* Porting vit

* Fix docstrings

* Fixing imports

* Adding missing import

* Fix mobilenet imports

* Fix tests

* Fix prototype tests

* Exclude get_weight from models on test

* Fix init files

* Porting googlenet

* Porting inception

* porting mobilenetv2

* porting mobilenetv3

* porting resnet

* porting shufflenetv2

* Fix test and linter

* Fixing docs.

* Porting Detection models (#5617)

* fix inits

* fix docs

* Port faster_rcnn

* Port fcos

* Port keypoint_rcnn

* Port mask_rcnn

* Port retinanet

* Port ssd

* Port ssdlite

* Fix linter

* Fixing tests

* Fixing tests

* Fixing vgg test

* Porting Optical Flow, Segmentation, Video models (#5619)

* Porting raft

* Porting video resnet

* Porting deeplabv3

* Porting fcn and lraspp

* Fixing the tests and linter

* Porting docs, examples, tutorials and galleries (#5620)

* Fix examples, tutorials and gallery

* Update gallery/plot_optical_flow.py
Co-authored-by: Nicolas Hug <contact@nicolas-hug.com>

* Fix import

* Revert hardcoded normalization

* fix uncommitted changes

* Fix bug

* Fix more bugs

* Making resize optional for segmentation

* Fixing preset

* Fix mypy

* Fixing documentation strings

* Fix flake8

* minor refactoring
Co-authored-by: Nicolas Hug <contact@nicolas-hug.com>

* Resolve conflict

* Porting model tests (#5622)

* Porting tests

* Remove unnecessary variable

* Fix linter

* Move prototype to extended tests

* Fix download models job

* Update CI on Multiweight branch to use the new weight download approach (#5628)

* port Pad to prototype transforms (#5621)

* port Pad to prototype transforms

* use literal

* Bump up LibTorchvision version number for Podspec to release Cocoapods (#5624)
Co-authored-by: Anton Thomma <anton@pri.co.nz>
Co-authored-by: Vasilis Vryniotis <datumbox@users.noreply.github.com>

* pre-download model weights in CI docs build (#5625)

* pre-download model weights in CI docs build

* move changes into template

* change docs image

* Regenerated config.yml
Co-authored-by: Philip Meier <github.pmeier@posteo.de>
Co-authored-by: Anton Thomma <11010310+thommaa@users.noreply.github.com>
Co-authored-by: Anton Thomma <anton@pri.co.nz>

* Porting reference scripts and updating presets (#5629)

* Making _preset.py classes

* Remove support of targets on presets.

* Rewriting the video preset

* Adding tests to check that the bundled transforms are JIT scriptable

* Rename all presets from *Eval to *Inference

* Minor refactoring

* Remove --prototype and --pretrained from reference scripts

* remove pretrained_backbone refs

* Corrections and simplifications

* Fixing bug

* Fixing linter

* Fix flake8

* restore documentation example

* minor fixes

* fix optical flow missing param

* Fixing commands

* Adding weights_backbone support in detection and segmentation

* Updating the commands for InceptionV3

* Setting `weights_backbone` to its fully BC value (#5653)

* Replace default `weights_backbone=None` with its BC values.

* Fixing tests

* Fix linter

* Update docs.

* Update preprocessing on reference scripts.

* Change qat/ptq to their full values.

* Refactoring preprocessing

* Fix video preset

* No initialization on VGG if pretrained

* Fix warning messages for backbone utils.

* Adding star to all preset constructors.

* Fix mypy.
Co-authored-by: Nicolas Hug <contact@nicolas-hug.com>
Co-authored-by: Philip Meier <github.pmeier@posteo.de>
Co-authored-by: Anton Thomma <11010310+thommaa@users.noreply.github.com>
Co-authored-by: Anton Thomma <anton@pri.co.nz>
parent 375e4ab2
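
In a nutshell, this change replaces the boolean `pretrained` flag with per-model weights enums across the classification, detection, segmentation, video and optical flow models. A minimal sketch of the before/after, using builders touched by this PR:

```
from torchvision.models import resnet50, ResNet50_Weights

# Old style, now superseded:
#   model = resnet50(pretrained=True)

# New multi-weight style: pick an explicit checkpoint, or DEFAULT for the
# best available one.
weights = ResNet50_Weights.IMAGENET1K_V1
model = resnet50(weights=weights)
preprocess = weights.transforms()  # inference transforms bundled with the checkpoint
```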
......@@ -366,19 +366,28 @@ jobs:
resource_class: xlarge
steps:
- checkout
- download_model_weights:
extract_roots: torchvision/prototype/models
- install_torchvision
- install_prototype_dependencies
- pip_install:
args: scipy pycocotools h5py
descr: Install optional dependencies
- run:
name: Enable prototype tests
command: echo 'export PYTORCH_TEST_WITH_PROTOTYPE=1' >> $BASH_ENV
- run_tests_selective:
file_or_dir: test/test_prototype_*.py
unittest_extended:
docker:
- image: circleci/python:3.7
resource_class: xlarge
steps:
- checkout
- download_model_weights
- install_torchvision
- run:
name: Enable extended tests
command: echo 'export PYTORCH_TEST_WITH_EXTENDED=1' >> $BASH_ENV
- run_tests_selective:
file_or_dir: test/test_extended_*.py
binary_linux_wheel:
<<: *binary_common
docker:
......@@ -1629,6 +1638,7 @@ workflows:
- unittest_torchhub
- unittest_onnx
- unittest_prototype
- unittest_extended
- unittest_linux_cpu:
cu_version: cpu
name: unittest_linux_cpu_py3.7
......
......@@ -366,19 +366,28 @@ jobs:
resource_class: xlarge
steps:
- checkout
- download_model_weights:
extract_roots: torchvision/prototype/models
- install_torchvision
- install_prototype_dependencies
- pip_install:
args: scipy pycocotools h5py
descr: Install optional dependencies
- run:
name: Enable prototype tests
command: echo 'export PYTORCH_TEST_WITH_PROTOTYPE=1' >> $BASH_ENV
- run_tests_selective:
file_or_dir: test/test_prototype_*.py
unittest_extended:
docker:
- image: circleci/python:3.7
resource_class: xlarge
steps:
- checkout
- download_model_weights
- install_torchvision
- run:
name: Enable extended tests
command: echo 'export PYTORCH_TEST_WITH_EXTENDED=1' >> $BASH_ENV
- run_tests_selective:
file_or_dir: test/test_extended_*.py
binary_linux_wheel:
<<: *binary_common
docker:
......@@ -1115,6 +1124,7 @@ workflows:
- unittest_torchhub
- unittest_onnx
- unittest_prototype
- unittest_extended
{{ unittest_workflows() }}
cmake:
......
import torch
import torchvision
from torch.utils.mobile_optimizer import optimize_for_mobile
from torchvision.models.detection import (
fasterrcnn_mobilenet_v3_large_320_fpn,
FasterRCNN_MobileNet_V3_Large_320_FPN_Weights,
)
print(torch.__version__)
model = torchvision.models.detection.fasterrcnn_mobilenet_v3_large_320_fpn(
pretrained=True, box_score_thresh=0.7, rpn_post_nms_top_n_test=100, rpn_score_thresh=0.4, rpn_pre_nms_top_n_test=150
model = fasterrcnn_mobilenet_v3_large_320_fpn(
weights=FasterRCNN_MobileNet_V3_Large_320_FPN_Weights.DEFAULT,
box_score_thresh=0.7,
rpn_post_nms_top_n_test=100,
rpn_score_thresh=0.4,
rpn_pre_nms_top_n_test=150,
)
model.eval()
......
......@@ -98,58 +98,6 @@ You can construct a model with random weights by calling its constructor:
convnext_large = models.convnext_large()
We provide pre-trained models, using the PyTorch :mod:`torch.utils.model_zoo`.
These can be constructed by passing ``pretrained=True``:
.. code:: python
import torchvision.models as models
resnet18 = models.resnet18(pretrained=True)
alexnet = models.alexnet(pretrained=True)
squeezenet = models.squeezenet1_0(pretrained=True)
vgg16 = models.vgg16(pretrained=True)
densenet = models.densenet161(pretrained=True)
inception = models.inception_v3(pretrained=True)
googlenet = models.googlenet(pretrained=True)
shufflenet = models.shufflenet_v2_x1_0(pretrained=True)
mobilenet_v2 = models.mobilenet_v2(pretrained=True)
mobilenet_v3_large = models.mobilenet_v3_large(pretrained=True)
mobilenet_v3_small = models.mobilenet_v3_small(pretrained=True)
resnext50_32x4d = models.resnext50_32x4d(pretrained=True)
wide_resnet50_2 = models.wide_resnet50_2(pretrained=True)
mnasnet = models.mnasnet1_0(pretrained=True)
efficientnet_b0 = models.efficientnet_b0(pretrained=True)
efficientnet_b1 = models.efficientnet_b1(pretrained=True)
efficientnet_b2 = models.efficientnet_b2(pretrained=True)
efficientnet_b3 = models.efficientnet_b3(pretrained=True)
efficientnet_b4 = models.efficientnet_b4(pretrained=True)
efficientnet_b5 = models.efficientnet_b5(pretrained=True)
efficientnet_b6 = models.efficientnet_b6(pretrained=True)
efficientnet_b7 = models.efficientnet_b7(pretrained=True)
efficientnet_v2_s = models.efficientnet_v2_s(pretrained=True)
efficientnet_v2_m = models.efficientnet_v2_m(pretrained=True)
efficientnet_v2_l = models.efficientnet_v2_l(pretrained=True)
regnet_y_400mf = models.regnet_y_400mf(pretrained=True)
regnet_y_800mf = models.regnet_y_800mf(pretrained=True)
regnet_y_1_6gf = models.regnet_y_1_6gf(pretrained=True)
regnet_y_3_2gf = models.regnet_y_3_2gf(pretrained=True)
regnet_y_8gf = models.regnet_y_8gf(pretrained=True)
regnet_y_16gf = models.regnet_y_16gf(pretrained=True)
regnet_y_32gf = models.regnet_y_32gf(pretrained=True)
regnet_x_400mf = models.regnet_x_400mf(pretrained=True)
regnet_x_800mf = models.regnet_x_800mf(pretrained=True)
regnet_x_1_6gf = models.regnet_x_1_6gf(pretrained=True)
regnet_x_3_2gf = models.regnet_x_3_2gf(pretrained=True)
regnet_x_8gf = models.regnet_x_8gf(pretrained=True)
regnet_x_16gf = models.regnet_x_16gf(pretrained=True)
regnet_x_32gf = models.regnet_x_32gf(pretrained=True)
vit_b_16 = models.vit_b_16(pretrained=True)
vit_b_32 = models.vit_b_32(pretrained=True)
vit_l_16 = models.vit_l_16(pretrained=True)
vit_l_32 = models.vit_l_32(pretrained=True)
convnext_tiny = models.convnext_tiny(pretrained=True)
convnext_small = models.convnext_small(pretrained=True)
convnext_base = models.convnext_base(pretrained=True)
convnext_large = models.convnext_large(pretrained=True)
Instantiating a pre-trained model will download its weights to a cache directory.
This directory can be set using the `TORCH_HOME` environment variable. See
......@@ -525,7 +473,7 @@ Obtaining a pre-trained quantized model can be done with a few lines of code:
.. code:: python
import torchvision.models as models
model = models.quantization.mobilenet_v2(pretrained=True, quantize=True)
model = models.quantization.mobilenet_v2(weights=MobileNet_V2_QuantizedWeights.IMAGENET1K_QNNPACK_V1, quantize=True)
model.eval()
# run the model with quantized inputs and weights
out = model(torch.rand(1, 3, 224, 224))
......
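
The snippet above assumes `MobileNet_V2_QuantizedWeights` is imported from the quantization namespace; a self-contained sketch, also showing the metadata each weights enum bundles:

```
import torch
from torchvision.models.quantization import mobilenet_v2, MobileNet_V2_QuantizedWeights

weights = MobileNet_V2_QuantizedWeights.IMAGENET1K_QNNPACK_V1
model = mobilenet_v2(weights=weights, quantize=True)
model.eval()

# Besides the checkpoint, the enum carries metadata such as the class labels.
print(len(weights.meta["categories"]))  # 1000 ImageNet categories
out = model(torch.rand(1, 3, 224, 224))
```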
......@@ -6,7 +6,7 @@ import torchvision
HERE = osp.dirname(osp.abspath(__file__))
ASSETS = osp.dirname(osp.dirname(HERE))
model = torchvision.models.resnet18(pretrained=False)
model = torchvision.models.resnet18()
model.eval()
traced_model = torch.jit.script(model)
......
......@@ -19,7 +19,6 @@ import numpy as np
import torch
import matplotlib.pyplot as plt
import torchvision.transforms.functional as F
import torchvision.transforms as T
plt.rcParams["savefig.bbox"] = "tight"
......@@ -88,24 +87,19 @@ plot(img1_batch)
# reduce the image sizes for the example to run faster. Image dimension must be
# divisible by 8.
from torchvision.models.optical_flow import Raft_Large_Weights
def preprocess(batch):
transforms = T.Compose(
[
T.ConvertImageDtype(torch.float32),
T.Normalize(mean=0.5, std=0.5), # map [0, 1] into [-1, 1]
T.Resize(size=(520, 960)),
]
)
batch = transforms(batch)
return batch
weights = Raft_Large_Weights.DEFAULT
transforms = weights.transforms()
# If you can, run this example on a GPU, it will be a lot faster.
device = "cuda" if torch.cuda.is_available() else "cpu"
def preprocess(img1_batch, img2_batch):
img1_batch = F.resize(img1_batch, size=[520, 960])
img2_batch = F.resize(img2_batch, size=[520, 960])
return transforms(img1_batch, img2_batch)
img1_batch = preprocess(img1_batch).to(device)
img2_batch = preprocess(img2_batch).to(device)
img1_batch, img2_batch = preprocess(img1_batch, img2_batch)
print(f"shape = {img1_batch.shape}, dtype = {img1_batch.dtype}")
......@@ -121,7 +115,10 @@ print(f"shape = {img1_batch.shape}, dtype = {img1_batch.dtype}")
from torchvision.models.optical_flow import raft_large
model = raft_large(pretrained=True, progress=False).to(device)
# If you can, run this example on a GPU, it will be a lot faster.
device = "cuda" if torch.cuda.is_available() else "cpu"
model = raft_large(weights=Raft_Large_Weights.DEFAULT, progress=False).to(device)
model = model.eval()
list_of_flows = model(img1_batch.to(device), img2_batch.to(device))
......@@ -182,10 +179,9 @@ plot(grid)
# from torchvision.io import write_jpeg
# for i, (img1, img2) in enumerate(zip(frames, frames[1:])):
# # Note: it would be faster to predict batches of flows instead of individual flows
# img1 = preprocess(img1[None]).to(device)
# img2 = preprocess(img2[None]).to(device)
# img1, img2 = preprocess(img1, img2)
# list_of_flows = model(img1_batch, img2_batch)
# list_of_flows = model(img1.to(device), img2.to(device))
# predicted_flow = list_of_flows[-1][0]
# flow_img = flow_to_image(predicted_flow).to("cpu")
# output_folder = "/tmp/" # Update this to the folder of your choice
......
......@@ -139,12 +139,14 @@ show(drawn_boxes)
# Here is a demo with a Faster R-CNN model loaded from
# :func:`~torchvision.models.detection.fasterrcnn_resnet50_fpn`
from torchvision.models.detection import fasterrcnn_resnet50_fpn
from torchvision.models.detection import fasterrcnn_resnet50_fpn, FasterRCNN_ResNet50_FPN_Weights
model = fasterrcnn_resnet50_fpn(pretrained=True, progress=False)
weights = FasterRCNN_ResNet50_FPN_Weights.DEFAULT
model = fasterrcnn_resnet50_fpn(weights=weights, progress=False)
print(img.size())
img = F.convert_image_dtype(img, torch.float)
transforms = weights.transforms()
img = transforms(img)
target = {}
target["boxes"] = boxes
target["labels"] = labels = torch.ones((masks.size(0),), dtype=torch.int64)
......
......@@ -85,20 +85,16 @@ show([transformed_dog1, transformed_dog2])
# Let's define a ``Predictor`` module that transforms the input tensor and then
# applies an ImageNet model on it.
from torchvision.models import resnet18
from torchvision.models import resnet18, ResNet18_Weights
class Predictor(nn.Module):
def __init__(self):
super().__init__()
self.resnet18 = resnet18(pretrained=True, progress=False).eval()
self.transforms = nn.Sequential(
T.Resize([256, ]), # We use single int value inside a list due to torchscript type restrictions
T.CenterCrop(224),
T.ConvertImageDtype(torch.float),
T.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
)
weights = ResNet18_Weights.DEFAULT
self.resnet18 = resnet18(weights=weights, progress=False).eval()
self.transforms = weights.transforms()
def forward(self, x: torch.Tensor) -> torch.Tensor:
with torch.no_grad():
......
......@@ -73,14 +73,17 @@ show(result)
# :func:`~torchvision.models.detection.ssd300_vgg16`. For more details
# on the output of such models, you may refer to :ref:`instance_seg_output`.
from torchvision.models.detection import fasterrcnn_resnet50_fpn
from torchvision.transforms.functional import convert_image_dtype
from torchvision.models.detection import fasterrcnn_resnet50_fpn, FasterRCNN_ResNet50_FPN_Weights
batch_int = torch.stack([dog1_int, dog2_int])
batch = convert_image_dtype(batch_int, dtype=torch.float)
model = fasterrcnn_resnet50_fpn(pretrained=True, progress=False)
weights = FasterRCNN_ResNet50_FPN_Weights.DEFAULT
transforms = weights.transforms()
batch = transforms(batch_int)
model = fasterrcnn_resnet50_fpn(weights=weights, progress=False)
model = model.eval()
outputs = model(batch)
......@@ -120,13 +123,15 @@ show(dogs_with_boxes)
# images must be normalized before they're passed to a semantic segmentation
# model.
from torchvision.models.segmentation import fcn_resnet50
from torchvision.models.segmentation import fcn_resnet50, FCN_ResNet50_Weights
weights = FCN_ResNet50_Weights.DEFAULT
transforms = weights.transforms(resize_size=None)
model = fcn_resnet50(pretrained=True, progress=False)
model = fcn_resnet50(weights=weights, progress=False)
model = model.eval()
normalized_batch = F.normalize(batch, mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225))
normalized_batch = transforms(batch)
output = model(normalized_batch)['out']
print(output.shape, output.min().item(), output.max().item())
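
As the segmentation snippet above shows, the bundled presets accept keyword overrides, which is what lets the hard-coded normalization be dropped. A minimal sketch of the `resize_size=None` escape hatch used here:

```
from torchvision.models.segmentation import fcn_resnet50, FCN_ResNet50_Weights

weights = FCN_ResNet50_Weights.DEFAULT
# resize_size=None keeps the input resolution; the preset then only converts
# the dtype and normalizes with the mean/std the checkpoint was trained with.
transforms = weights.transforms(resize_size=None)
model = fcn_resnet50(weights=weights, progress=False).eval()
```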
......@@ -262,8 +267,14 @@ show(dogs_with_masks)
# of them may not have masks, like
# :func:`~torchvision.models.detection.fasterrcnn_resnet50_fpn`.
from torchvision.models.detection import maskrcnn_resnet50_fpn
model = maskrcnn_resnet50_fpn(pretrained=True, progress=False)
from torchvision.models.detection import maskrcnn_resnet50_fpn, MaskRCNN_ResNet50_FPN_Weights
weights = MaskRCNN_ResNet50_FPN_Weights.DEFAULT
transforms = weights.transforms()
batch = transforms(batch_int)
model = maskrcnn_resnet50_fpn(weights=weights, progress=False)
model = model.eval()
output = model(batch)
......@@ -378,13 +389,17 @@ show(dogs_with_masks)
# Note that the keypoint detection model does not need normalized images.
#
from torchvision.models.detection import keypointrcnn_resnet50_fpn
from torchvision.models.detection import keypointrcnn_resnet50_fpn, KeypointRCNN_ResNet50_FPN_Weights
from torchvision.io import read_image
person_int = read_image(str(Path("assets") / "person1.jpg"))
person_float = convert_image_dtype(person_int, dtype=torch.float)
model = keypointrcnn_resnet50_fpn(pretrained=True, progress=False)
weights = KeypointRCNN_ResNet50_FPN_Weights.DEFAULT
transforms = weights.transforms()
person_float = transforms(person_int)
model = keypointrcnn_resnet50_fpn(weights=weights, progress=False)
model = model.eval()
outputs = model([person_float])
......
import torch
import torchvision
from torch.utils.mobile_optimizer import optimize_for_mobile
from torchvision.models.detection import (
fasterrcnn_mobilenet_v3_large_320_fpn,
FasterRCNN_MobileNet_V3_Large_320_FPN_Weights,
)
print(torch.__version__)
model = torchvision.models.detection.fasterrcnn_mobilenet_v3_large_320_fpn(
pretrained=True, box_score_thresh=0.7, rpn_post_nms_top_n_test=100, rpn_score_thresh=0.4, rpn_pre_nms_top_n_test=150
model = fasterrcnn_mobilenet_v3_large_320_fpn(
weights=FasterRCNN_MobileNet_V3_Large_320_FPN_Weights.DEFAULT,
box_score_thresh=0.7,
rpn_post_nms_top_n_test=100,
rpn_score_thresh=0.4,
rpn_pre_nms_top_n_test=150,
)
model.eval()
......
......@@ -43,7 +43,7 @@ Since it expects tensors with a size of N x 3 x 299 x 299, to validate the model
```
torchrun --nproc_per_node=8 train.py --model inception_v3\
--val-resize-size 342 --val-crop-size 299 --train-crop-size 299 --test-only --pretrained
--test-only --weights Inception_V3_Weights.IMAGENET1K_V1
```
### ResNet
......@@ -96,22 +96,14 @@ The weights of the B5-B7 variants are ported from Luke Melas' [EfficientNet-PyTo
All models were trained using bicubic interpolation and each has custom crop and resize sizes. To validate the models use the following commands:
```
torchrun --nproc_per_node=8 train.py --model efficientnet_b0 --interpolation bicubic\
--val-resize-size 256 --val-crop-size 224 --train-crop-size 224 --test-only --pretrained
torchrun --nproc_per_node=8 train.py --model efficientnet_b1 --interpolation bicubic\
--val-resize-size 256 --val-crop-size 240 --train-crop-size 240 --test-only --pretrained
torchrun --nproc_per_node=8 train.py --model efficientnet_b2 --interpolation bicubic\
--val-resize-size 288 --val-crop-size 288 --train-crop-size 288 --test-only --pretrained
torchrun --nproc_per_node=8 train.py --model efficientnet_b3 --interpolation bicubic\
--val-resize-size 320 --val-crop-size 300 --train-crop-size 300 --test-only --pretrained
torchrun --nproc_per_node=8 train.py --model efficientnet_b4 --interpolation bicubic\
--val-resize-size 384 --val-crop-size 380 --train-crop-size 380 --test-only --pretrained
torchrun --nproc_per_node=8 train.py --model efficientnet_b5 --interpolation bicubic\
--val-resize-size 456 --val-crop-size 456 --train-crop-size 456 --test-only --pretrained
torchrun --nproc_per_node=8 train.py --model efficientnet_b6 --interpolation bicubic\
--val-resize-size 528 --val-crop-size 528 --train-crop-size 528 --test-only --pretrained
torchrun --nproc_per_node=8 train.py --model efficientnet_b7 --interpolation bicubic\
--val-resize-size 600 --val-crop-size 600 --train-crop-size 600 --test-only --pretrained
torchrun --nproc_per_node=8 train.py --model efficientnet_b0 --test-only --weights EfficientNet_B0_Weights.IMAGENET1K_V1
torchrun --nproc_per_node=8 train.py --model efficientnet_b1 --test-only --weights EfficientNet_B1_Weights.IMAGENET1K_V1
torchrun --nproc_per_node=8 train.py --model efficientnet_b2 --test-only --weights EfficientNet_B2_Weights.IMAGENET1K_V1
torchrun --nproc_per_node=8 train.py --model efficientnet_b3 --test-only --weights EfficientNet_B3_Weights.IMAGENET1K_V1
torchrun --nproc_per_node=8 train.py --model efficientnet_b4 --test-only --weights EfficientNet_B4_Weights.IMAGENET1K_V1
torchrun --nproc_per_node=8 train.py --model efficientnet_b5 --test-only --weights EfficientNet_B5_Weights.IMAGENET1K_V1
torchrun --nproc_per_node=8 train.py --model efficientnet_b6 --test-only --weights EfficientNet_B6_Weights.IMAGENET1K_V1
torchrun --nproc_per_node=8 train.py --model efficientnet_b7 --test-only --weights EfficientNet_B7_Weights.IMAGENET1K_V1
```
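
The interpolation, resize and crop flags disappear from these commands because, with `--test-only --weights`, the reference script now takes its validation preprocessing from the weights enum itself (see the `load_data` change further down). A sketch of what each enum encodes, using one of the variants above:

```
from torchvision.models import EfficientNet_B4_Weights

weights = EfficientNet_B4_Weights.IMAGENET1K_V1
preprocess = weights.transforms()
# The preset already carries the resize size, crop size and bicubic
# interpolation, so they no longer need to be spelled out on the CLI.
```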
......
......@@ -6,6 +6,7 @@ from torchvision.transforms.functional import InterpolationMode
class ClassificationPresetTrain:
def __init__(
self,
*,
crop_size,
mean=(0.485, 0.456, 0.406),
std=(0.229, 0.224, 0.225),
......@@ -46,6 +47,7 @@ class ClassificationPresetTrain:
class ClassificationPresetEval:
def __init__(
self,
*,
crop_size,
resize_size=256,
mean=(0.485, 0.456, 0.406),
......
......@@ -15,12 +15,6 @@ from torch.utils.data.dataloader import default_collate
from torchvision.transforms.functional import InterpolationMode
try:
from torchvision import prototype
except ImportError:
prototype = None
def train_one_epoch(model, criterion, optimizer, data_loader, device, epoch, args, model_ema=None, scaler=None):
model.train()
metric_logger = utils.MetricLogger(delimiter=" ")
......@@ -154,18 +148,13 @@ def load_data(traindir, valdir, args):
print(f"Loading dataset_test from {cache_path}")
dataset_test, _ = torch.load(cache_path)
else:
if not args.prototype:
if args.weights and args.test_only:
weights = torchvision.models.get_weight(args.weights)
preprocessing = weights.transforms()
else:
preprocessing = presets.ClassificationPresetEval(
crop_size=val_crop_size, resize_size=val_resize_size, interpolation=interpolation
)
else:
if args.weights:
weights = prototype.models.get_weight(args.weights)
preprocessing = weights.transforms()
else:
preprocessing = prototype.transforms.ImageClassificationEval(
crop_size=val_crop_size, resize_size=val_resize_size, interpolation=interpolation
)
dataset_test = torchvision.datasets.ImageFolder(
valdir,
......@@ -191,10 +180,6 @@ def load_data(traindir, valdir, args):
def main(args):
if args.prototype and prototype is None:
raise ImportError("The prototype module couldn't be found. Please install the latest torchvision nightly.")
if not args.prototype and args.weights:
raise ValueError("The weights parameter works only in prototype mode. Please pass the --prototype argument.")
if args.output_dir:
utils.mkdir(args.output_dir)
......@@ -236,10 +221,7 @@ def main(args):
)
print("Creating model")
if not args.prototype:
model = torchvision.models.__dict__[args.model](pretrained=args.pretrained, num_classes=num_classes)
else:
model = prototype.models.__dict__[args.model](weights=args.weights, num_classes=num_classes)
model = torchvision.models.__dict__[args.model](weights=args.weights, num_classes=num_classes)
model.to(device)
if args.distributed and args.sync_bn:
......@@ -446,12 +428,6 @@ def get_args_parser(add_help=True):
help="Only test the model",
action="store_true",
)
parser.add_argument(
"--pretrained",
dest="pretrained",
help="Use pre-trained models from the modelzoo",
action="store_true",
)
parser.add_argument("--auto-augment", default=None, type=str, help="auto augment policy (default: None)")
parser.add_argument("--random-erase", default=0.0, type=float, help="random erasing probability (default: 0.0)")
......@@ -496,14 +472,6 @@ def get_args_parser(add_help=True):
parser.add_argument(
"--ra-reps", default=3, type=int, help="number of repetitions for Repeated Augmentation (default: 3)"
)
# Prototype models only
parser.add_argument(
"--prototype",
dest="prototype",
help="Use prototype model builders instead those from main area",
action="store_true",
)
parser.add_argument("--weights", default=None, type=str, help="the weights enum name to load")
return parser
......
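
The hunk above resolves the `--weights` string with `torchvision.models.get_weight`; a minimal sketch of that lookup:

```
import torchvision

# Resolve the enum from its fully qualified name, exactly as load_data does.
weights = torchvision.models.get_weight("ResNet50_Weights.IMAGENET1K_V1")
preprocessing = weights.transforms()
```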
......@@ -12,17 +12,7 @@ from torch import nn
from train import train_one_epoch, evaluate, load_data
try:
from torchvision import prototype
except ImportError:
prototype = None
def main(args):
if args.prototype and prototype is None:
raise ImportError("The prototype module couldn't be found. Please install the latest torchvision nightly.")
if not args.prototype and args.weights:
raise ValueError("The weights parameter works only in prototype mode. Please pass the --prototype argument.")
if args.output_dir:
utils.mkdir(args.output_dir)
......@@ -56,10 +46,7 @@ def main(args):
print("Creating model", args.model)
# when training quantized models, we always start from a pre-trained fp32 reference model
if not args.prototype:
model = torchvision.models.quantization.__dict__[args.model](pretrained=True, quantize=args.test_only)
else:
model = prototype.models.quantization.__dict__[args.model](weights=args.weights, quantize=args.test_only)
model = torchvision.models.quantization.__dict__[args.model](weights=args.weights, quantize=args.test_only)
model.to(device)
if not (args.test_only or args.post_training_quantize):
......@@ -264,14 +251,6 @@ def get_args_parser(add_help=True):
"--train-crop-size", default=224, type=int, help="the random crop size used for training (default: 224)"
)
parser.add_argument("--clip-grad-norm", default=None, type=float, help="the maximum gradient norm (default None)")
# Prototype models only
parser.add_argument(
"--prototype",
dest="prototype",
help="Use prototype model builders instead those from main area",
action="store_true",
)
parser.add_argument("--weights", default=None, type=str, help="the weights enum name to load")
return parser
......
......@@ -330,22 +330,22 @@ def store_model_weights(model, checkpoint_path, checkpoint_key="model", strict=T
from torchvision import models as M
# Classification
model = M.mobilenet_v3_large(pretrained=False)
model = M.mobilenet_v3_large(weights=None)
print(store_model_weights(model, './class.pth'))
# Quantized Classification
model = M.quantization.mobilenet_v3_large(pretrained=False, quantize=False)
model = M.quantization.mobilenet_v3_large(weights=None, quantize=False)
model.fuse_model(is_qat=True)
model.qconfig = torch.ao.quantization.get_default_qat_qconfig('qnnpack')
_ = torch.ao.quantization.prepare_qat(model, inplace=True)
print(store_model_weights(model, './qat.pth'))
# Object Detection
model = M.detection.fasterrcnn_mobilenet_v3_large_fpn(pretrained=False, pretrained_backbone=False)
model = M.detection.fasterrcnn_mobilenet_v3_large_fpn(weights=None, weights_backbone=None)
print(store_model_weights(model, './obj.pth'))
# Segmentation
model = M.segmentation.deeplabv3_mobilenet_v3_large(pretrained=False, pretrained_backbone=False, aux_loss=True)
model = M.segmentation.deeplabv3_mobilenet_v3_large(weights=None, weights_backbone=None, aux_loss=True)
print(store_model_weights(model, './segm.pth', strict=False))
Args:
......
......@@ -24,35 +24,35 @@ Except otherwise noted, all models have been trained on 8x V100 GPUs.
```
torchrun --nproc_per_node=8 train.py\
--dataset coco --model fasterrcnn_resnet50_fpn --epochs 26\
--lr-steps 16 22 --aspect-ratio-group-factor 3
--lr-steps 16 22 --aspect-ratio-group-factor 3 --weights-backbone ResNet50_Weights.IMAGENET1K_V1
```
### Faster R-CNN MobileNetV3-Large FPN
```
torchrun --nproc_per_node=8 train.py\
--dataset coco --model fasterrcnn_mobilenet_v3_large_fpn --epochs 26\
--lr-steps 16 22 --aspect-ratio-group-factor 3
--lr-steps 16 22 --aspect-ratio-group-factor 3 --weights-backbone MobileNet_V3_Large_Weights.IMAGENET1K_V1
```
### Faster R-CNN MobileNetV3-Large 320 FPN
```
torchrun --nproc_per_node=8 train.py\
--dataset coco --model fasterrcnn_mobilenet_v3_large_320_fpn --epochs 26\
--lr-steps 16 22 --aspect-ratio-group-factor 3
--lr-steps 16 22 --aspect-ratio-group-factor 3 --weights-backbone MobileNet_V3_Large_Weights.IMAGENET1K_V1
```
### FCOS ResNet-50 FPN
```
torchrun --nproc_per_node=8 train.py\
--dataset coco --model fcos_resnet50_fpn --epochs 26\
--lr-steps 16 22 --aspect-ratio-group-factor 3 --lr 0.01 --amp
--lr-steps 16 22 --aspect-ratio-group-factor 3 --lr 0.01 --amp --weights-backbone ResNet50_Weights.IMAGENET1K_V1
```
### RetinaNet
```
torchrun --nproc_per_node=8 train.py\
--dataset coco --model retinanet_resnet50_fpn --epochs 26\
--lr-steps 16 22 --aspect-ratio-group-factor 3 --lr 0.01
--lr-steps 16 22 --aspect-ratio-group-factor 3 --lr 0.01 --weights-backbone ResNet50_Weights.IMAGENET1K_V1
```
### SSD300 VGG16
......@@ -60,7 +60,7 @@ torchrun --nproc_per_node=8 train.py\
torchrun --nproc_per_node=8 train.py\
--dataset coco --model ssd300_vgg16 --epochs 120\
--lr-steps 80 110 --aspect-ratio-group-factor 3 --lr 0.002 --batch-size 4\
--weight-decay 0.0005 --data-augmentation ssd
--weight-decay 0.0005 --data-augmentation ssd --weights-backbone VGG16_Weights.IMAGENET1K_FEATURES
```
### SSDlite320 MobileNetV3-Large
......@@ -68,7 +68,7 @@ torchrun --nproc_per_node=8 train.py\
torchrun --nproc_per_node=8 train.py\
--dataset coco --model ssdlite320_mobilenet_v3_large --epochs 660\
--aspect-ratio-group-factor 3 --lr-scheduler cosineannealinglr --lr 0.15 --batch-size 24\
--weight-decay 0.00004 --data-augmentation ssdlite
--weight-decay 0.00004 --data-augmentation ssdlite --weights-backbone MobileNet_V3_Large_Weights.IMAGENET1K_V1
```
......@@ -76,7 +76,7 @@ torchrun --nproc_per_node=8 train.py\
```
torchrun --nproc_per_node=8 train.py\
--dataset coco --model maskrcnn_resnet50_fpn --epochs 26\
--lr-steps 16 22 --aspect-ratio-group-factor 3
--lr-steps 16 22 --aspect-ratio-group-factor 3 --weights-backbone ResNet50_Weights.IMAGENET1K_V1
```
......@@ -84,5 +84,5 @@ torchrun --nproc_per_node=8 train.py\
```
torchrun --nproc_per_node=8 train.py\
--dataset coco_kp --model keypointrcnn_resnet50_fpn --epochs 46\
--lr-steps 36 43 --aspect-ratio-group-factor 3
--lr-steps 36 43 --aspect-ratio-group-factor 3 --weights-backbone ResNet50_Weights.IMAGENET1K_V1
```
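
The reference script now forwards `--weights-backbone` directly to the model builder (see the `main` change below), so these commands spell out the backbone checkpoint that the old `--pretrained` flag used to imply. At the Python level, a sketch:

```
from torchvision.models import ResNet50_Weights
from torchvision.models.detection import fasterrcnn_resnet50_fpn

# Detector weights start random; only the backbone is ImageNet-pretrained.
model = fasterrcnn_resnet50_fpn(
    weights=None,
    weights_backbone=ResNet50_Weights.IMAGENET1K_V1,
    num_classes=91,
)
```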
......@@ -33,12 +33,6 @@ from engine import train_one_epoch, evaluate
from group_by_aspect_ratio import GroupedBatchSampler, create_aspect_ratio_groups
try:
from torchvision import prototype
except ImportError:
prototype = None
def get_dataset(name, image_set, transform, data_path):
paths = {"coco": (data_path, get_coco, 91), "coco_kp": (data_path, get_coco_kp, 2)}
p, ds_fn, num_classes = paths[name]
......@@ -49,15 +43,13 @@ def get_dataset(name, image_set, transform, data_path):
def get_transform(train, args):
if train:
return presets.DetectionPresetTrain(args.data_augmentation)
elif not args.prototype:
return presets.DetectionPresetEval()
return presets.DetectionPresetTrain(data_augmentation=args.data_augmentation)
elif args.weights and args.test_only:
weights = torchvision.models.get_weight(args.weights)
trans = weights.transforms()
return lambda img, target: (trans(img), target)
else:
if args.weights:
weights = prototype.models.get_weight(args.weights)
return weights.transforms()
else:
return prototype.transforms.ObjectDetectionEval()
return presets.DetectionPresetEval()
def get_args_parser(add_help=True):
......@@ -132,25 +124,12 @@ def get_args_parser(add_help=True):
help="Only test the model",
action="store_true",
)
parser.add_argument(
"--pretrained",
dest="pretrained",
help="Use pre-trained models from the modelzoo",
action="store_true",
)
# distributed training parameters
parser.add_argument("--world-size", default=1, type=int, help="number of distributed processes")
parser.add_argument("--dist-url", default="env://", type=str, help="url used to set up distributed training")
# Prototype models only
parser.add_argument(
"--prototype",
dest="prototype",
help="Use prototype model builders instead those from main area",
action="store_true",
)
parser.add_argument("--weights", default=None, type=str, help="the weights enum name to load")
parser.add_argument("--weights-backbone", default=None, type=str, help="the backbone weights enum name to load")
# Mixed precision training parameters
parser.add_argument("--amp", action="store_true", help="Use torch.cuda.amp for mixed precision training")
......@@ -159,10 +138,6 @@ def get_args_parser(add_help=True):
def main(args):
if args.prototype and prototype is None:
raise ImportError("The prototype module couldn't be found. Please install the latest torchvision nightly.")
if not args.prototype and args.weights:
raise ValueError("The weights parameter works only in prototype mode. Please pass the --prototype argument.")
if args.output_dir:
utils.mkdir(args.output_dir)
......@@ -204,12 +179,9 @@ def main(args):
if "rcnn" in args.model:
if args.rpn_score_thresh is not None:
kwargs["rpn_score_thresh"] = args.rpn_score_thresh
if not args.prototype:
model = torchvision.models.detection.__dict__[args.model](
pretrained=args.pretrained, num_classes=num_classes, **kwargs
)
else:
model = prototype.models.detection.__dict__[args.model](weights=args.weights, num_classes=num_classes, **kwargs)
model = torchvision.models.detection.__dict__[args.model](
weights=args.weights, weights_backbone=args.weights_backbone, num_classes=num_classes, **kwargs
)
model.to(device)
if args.distributed and args.sync_bn:
model = torch.nn.SyncBatchNorm.convert_sync_batchnorm(model)
......
......@@ -51,7 +51,7 @@ torchrun --nproc_per_node 8 --nnodes 1 train.py \
### Evaluation
```
torchrun --nproc_per_node 1 --nnodes 1 train.py --val-dataset sintel --batch-size 1 --dataset-root $dataset_root --model raft_large --pretrained
torchrun --nproc_per_node 1 --nnodes 1 train.py --val-dataset sintel --batch-size 1 --dataset-root $dataset_root --model raft_large --weights Raft_Large_Weights.C_T_SKHT_V2
```
This should give an epe of about 1.3822 on the clean pass and 2.7161 on the
......@@ -67,6 +67,6 @@ Sintel val final epe: 2.7161 1px: 0.8528 3px: 0.9204 5px: 0.9392 per_image_epe:
You can also evaluate on Kitti train:
```
torchrun --nproc_per_node 1 --nnodes 1 train.py --val-dataset kitti --batch-size 1 --dataset-root $dataset_root --model raft_large --pretrained
torchrun --nproc_per_node 1 --nnodes 1 train.py --val-dataset kitti --batch-size 1 --dataset-root $dataset_root --model raft_large --weights Raft_Large_Weights.C_T_SKHT_V2
Kitti val epe: 4.7968 1px: 0.6388 3px: 0.8197 5px: 0.8661 per_image_epe: 4.5118 f1: 16.0679
```
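
For reference, the same checkpoint the commands above evaluate can be loaded directly in Python (enum name taken from those commands):

```
from torchvision.models.optical_flow import raft_large, Raft_Large_Weights

# The checkpoint evaluated by the torchrun commands above.
model = raft_large(weights=Raft_Large_Weights.C_T_SKHT_V2).eval()
```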
......@@ -22,6 +22,7 @@ class OpticalFlowPresetEval(torch.nn.Module):
class OpticalFlowPresetTrain(torch.nn.Module):
def __init__(
self,
*,
# RandomResizeAndCrop params
crop_size,
min_scale=-0.2,
......
......@@ -9,11 +9,6 @@ import utils
from presets import OpticalFlowPresetTrain, OpticalFlowPresetEval
from torchvision.datasets import KittiFlow, FlyingChairs, FlyingThings3D, Sintel, HD1K
try:
from torchvision import prototype
except ImportError:
prototype = None
def get_train_dataset(stage, dataset_root):
if stage == "chairs":
......@@ -138,12 +133,18 @@ def _evaluate(model, args, val_dataset, *, padder_mode, num_flow_updates=None, b
def evaluate(model, args):
val_datasets = args.val_dataset or []
if args.prototype:
if args.weights:
weights = prototype.models.get_weight(args.weights)
preprocessing = weights.transforms()
else:
preprocessing = prototype.transforms.OpticalFlowEval()
if args.weights and args.test_only:
weights = torchvision.models.get_weight(args.weights)
trans = weights.transforms()
def preprocessing(img1, img2, flow, valid_flow_mask):
img1, img2 = trans(img1, img2)
if flow is not None and not isinstance(flow, torch.Tensor):
flow = torch.from_numpy(flow)
if valid_flow_mask is not None and not isinstance(valid_flow_mask, torch.Tensor):
valid_flow_mask = torch.from_numpy(valid_flow_mask)
return img1, img2, flow, valid_flow_mask
else:
preprocessing = OpticalFlowPresetEval()
......@@ -201,20 +202,14 @@ def train_one_epoch(model, optimizer, scheduler, train_loader, logger, args):
def main(args):
if args.prototype and prototype is None:
raise ImportError("The prototype module couldn't be found. Please install the latest torchvision nightly.")
if not args.prototype and args.weights:
raise ValueError("The weights parameter works only in prototype mode. Please pass the --prototype argument.")
utils.setup_ddp(args)
args.test_only = args.train_dataset is None
if args.distributed and args.device == "cpu":
raise ValueError("The device must be cuda if we want to run in distributed mode using torchrun")
device = torch.device(args.device)
if args.prototype:
model = prototype.models.optical_flow.__dict__[args.model](weights=args.weights)
else:
model = torchvision.models.optical_flow.__dict__[args.model](pretrained=args.pretrained)
model = torchvision.models.optical_flow.__dict__[args.model](weights=args.weights)
if args.distributed:
model = model.to(args.local_rank)
......@@ -228,7 +223,7 @@ def main(args):
checkpoint = torch.load(args.resume, map_location="cpu")
model_without_ddp.load_state_dict(checkpoint["model"])
if args.train_dataset is None:
if args.test_only:
# Set deterministic CUDNN algorithms, since they can affect epe a fair bit.
torch.backends.cudnn.benchmark = False
torch.backends.cudnn.deterministic = True
......@@ -356,8 +351,7 @@ def get_args_parser(add_help=True):
parser.add_argument(
"--model", type=str, default="raft_large", help="The name of the model to use - either raft_large or raft_small"
)
# TODO: resume, pretrained, and weights should be in an exclusive arg group
parser.add_argument("--pretrained", action="store_true", help="Whether to use pretrained weights")
# TODO: resume and weights should be in an exclusive arg group
parser.add_argument(
"--num_flow_updates",
......@@ -376,13 +370,6 @@ def get_args_parser(add_help=True):
required=True,
)
# Prototype models only
parser.add_argument(
"--prototype",
dest="prototype",
help="Use prototype model builders instead those from main area",
action="store_true",
)
parser.add_argument("--weights", default=None, type=str, help="the weights enum name to load.")
parser.add_argument("--device", default="cuda", type=str, help="device (Use cuda or cpu, Default: cuda)")
......