Unverified Commit 11bd2eaa authored by Vasilis Vryniotis, committed by GitHub

Port Multi-weight support from prototype to main (#5618)
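At a glance: model builders no longer take a `pretrained` boolean; they take a `weights` enum entry that identifies a specific checkpoint and bundles the matching inference transforms. A minimal sketch of the pattern (not part of the diff below; it only reuses the ResNet-18 names that appear in it):

```
import torch
from torchvision.models import resnet18, ResNet18_Weights

# Old API (removed throughout this PR): resnet18(pretrained=True)

# New API: pick an explicit checkpoint; DEFAULT points to the best available one.
weights = ResNet18_Weights.DEFAULT
model = resnet18(weights=weights).eval()

# Every weights entry ships the preprocessing preset it was trained with.
preprocess = weights.transforms()
batch = preprocess(torch.rand(1, 3, 256, 256))  # dummy image, values in [0, 1]
prediction = model(batch)
```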



* Moving base files outside of prototype and porting AlexNet, ConvNeXt, DenseNet and EfficientNet.

* Porting googlenet

* Porting inception

* Porting mnasnet

* Porting mobilenetv2

* Porting mobilenetv3

* Porting regnet

* Porting resnet

* Porting shufflenetv2

* Porting squeezenet

* Porting vgg

* Porting vit

* Fix docstrings

* Fixing imports

* Adding missing import

* Fix mobilenet imports

* Fix tests

* Fix prototype tests

* Exclude get_weight from the model listing in tests

* Fix init files

* Porting googlenet

* Porting inception

* porting mobilenetv2

* porting mobilenetv3

* porting resnet

* porting shufflenetv2

* Fix test and linter

* Fixing docs.

* Porting Detection models (#5617)

* fix inits

* fix docs

* Port faster_rcnn

* Port fcos

* Port keypoint_rcnn

* Port mask_rcnn

* Port retinanet

* Port ssd

* Port ssdlite

* Fix linter

* Fixing tests

* Fixing tests

* Fixing vgg test

* Porting Optical Flow, Segmentation, Video models (#5619)

* Porting raft

* Porting video resnet

* Porting deeplabv3

* Porting fcn and lraspp

* Fixing the tests and linter

* Porting docs, examples, tutorials and galleries (#5620)

* Fix examples, tutorials and gallery

* Update gallery/plot_optical_flow.py
Co-authored-by: Nicolas Hug <contact@nicolas-hug.com>

* Fix import

* Revert hardcoded normalization

* fix uncommitted changes

* Fix bug

* Fix more bugs

* Making resize optional for segmentation

* Fixing preset

* Fix mypy

* Fixing documentation strings

* Fix flake8

* minor refactoring
Co-authored-by: Nicolas Hug <contact@nicolas-hug.com>

* Resolve conflict

* Porting model tests (#5622)

* Porting tests

* Remove unnecessary variable

* Fix linter

* Move prototype to extended tests

* Fix download models job

* Update CI on Multiweight branch to use the new weight download approach (#5628)

* port Pad to prototype transforms (#5621)

* port Pad to prototype transforms

* use literal

* Bump up LibTorchvision version number for Podspec to release Cocoapods (#5624)
Co-authored-by: Anton Thomma <anton@pri.co.nz>
Co-authored-by: Vasilis Vryniotis <datumbox@users.noreply.github.com>

* pre-download model weights in CI docs build (#5625)

* pre-download model weights in CI docs build

* move changes into template

* change docs image

* Regenerated config.yml
Co-authored-by: Philip Meier <github.pmeier@posteo.de>
Co-authored-by: Anton Thomma <11010310+thommaa@users.noreply.github.com>
Co-authored-by: Anton Thomma <anton@pri.co.nz>

* Porting reference scripts and updating presets (#5629)

* Making the _presets.py presets classes

* Remove support for targets in presets.

* Rewriting the video preset

* Adding tests to check that the bundled transforms are JIT scriptable

* Rename all presets from *Eval to *Inference

* Minor refactoring

* Remove --prototype and --pretrained from reference scripts

* Remove pretrained_backbone refs

* Corrections and simplifications

* Fixing bug

* Fixing linter

* Fix flake8

* restore documentation example

* minor fixes

* fix optical flow missing param

* Fixing commands

* Adding weights_backbone support in detection and segmentation (see the sketch just before the diff below)

* Updating the commands for InceptionV3

* Setting `weights_backbone` to its fully backwards-compatible (BC) value (#5653)

* Replace default `weights_backbone=None` with its BC values.

* Fixing tests

* Fix linter

* Update docs.

* Update preprocessing on reference scripts.

* Change qat/ptq to their full values.

* Refactoring preprocessing

* Fix video preset

* No initialization on VGG if pretrained

* Fix warning messages for backbone utils.

* Adding star to all preset constructors.

* Fix mypy.
Co-authored-by: Nicolas Hug <contact@nicolas-hug.com>
Co-authored-by: Philip Meier <github.pmeier@posteo.de>
Co-authored-by: Anton Thomma <11010310+thommaa@users.noreply.github.com>
Co-authored-by: Anton Thomma <anton@pri.co.nz>
parent 375e4ab2
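The detection builders additionally take a separate `weights_backbone` argument, exposed by the reference scripts as `--weights-backbone`. A minimal sketch of the two call patterns, reusing only enum names that appear in the diff below (Faster R-CNN / ResNet-50); treat it as illustrative rather than as part of the commit:

```
from torchvision.models import ResNet50_Weights
from torchvision.models.detection import (
    fasterrcnn_resnet50_fpn,
    FasterRCNN_ResNet50_FPN_Weights,
)

# Fully pre-trained detector: the checkpoint already contains the backbone weights.
model = fasterrcnn_resnet50_fpn(weights=FasterRCNN_ResNet50_FPN_Weights.DEFAULT)

# Fresh detection head on an ImageNet pre-trained backbone, which is what the
# reference scripts request via --weights-backbone when training from scratch.
model = fasterrcnn_resnet50_fpn(
    weights=None,
    weights_backbone=ResNet50_Weights.IMAGENET1K_V1,
    num_classes=91,
)
```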
...@@ -366,19 +366,28 @@ jobs:
    resource_class: xlarge
    steps:
      - checkout
-      - download_model_weights:
-          extract_roots: torchvision/prototype/models
      - install_torchvision
      - install_prototype_dependencies
      - pip_install:
          args: scipy pycocotools h5py
          descr: Install optional dependencies
-      - run:
-          name: Enable prototype tests
-          command: echo 'export PYTORCH_TEST_WITH_PROTOTYPE=1' >> $BASH_ENV
      - run_tests_selective:
          file_or_dir: test/test_prototype_*.py
+  unittest_extended:
+    docker:
+      - image: circleci/python:3.7
+    resource_class: xlarge
+    steps:
+      - checkout
+      - download_model_weights
+      - install_torchvision
+      - run:
+          name: Enable extended tests
+          command: echo 'export PYTORCH_TEST_WITH_EXTENDED=1' >> $BASH_ENV
+      - run_tests_selective:
+          file_or_dir: test/test_extended_*.py
  binary_linux_wheel:
    <<: *binary_common
    docker:
...@@ -1629,6 +1638,7 @@ workflows:
      - unittest_torchhub
      - unittest_onnx
      - unittest_prototype
+      - unittest_extended
      - unittest_linux_cpu:
          cu_version: cpu
          name: unittest_linux_cpu_py3.7
...
...@@ -366,19 +366,28 @@ jobs:
    resource_class: xlarge
    steps:
      - checkout
-      - download_model_weights:
-          extract_roots: torchvision/prototype/models
      - install_torchvision
      - install_prototype_dependencies
      - pip_install:
          args: scipy pycocotools h5py
          descr: Install optional dependencies
-      - run:
-          name: Enable prototype tests
-          command: echo 'export PYTORCH_TEST_WITH_PROTOTYPE=1' >> $BASH_ENV
      - run_tests_selective:
          file_or_dir: test/test_prototype_*.py
+  unittest_extended:
+    docker:
+      - image: circleci/python:3.7
+    resource_class: xlarge
+    steps:
+      - checkout
+      - download_model_weights
+      - install_torchvision
+      - run:
+          name: Enable extended tests
+          command: echo 'export PYTORCH_TEST_WITH_EXTENDED=1' >> $BASH_ENV
+      - run_tests_selective:
+          file_or_dir: test/test_extended_*.py
  binary_linux_wheel:
    <<: *binary_common
    docker:
...@@ -1115,6 +1124,7 @@ workflows:
      - unittest_torchhub
      - unittest_onnx
      - unittest_prototype
+      - unittest_extended
      {{ unittest_workflows() }}
  cmake:
...
import torch
-import torchvision
from torch.utils.mobile_optimizer import optimize_for_mobile
+from torchvision.models.detection import (
+    fasterrcnn_mobilenet_v3_large_320_fpn,
+    FasterRCNN_MobileNet_V3_Large_320_FPN_Weights,
+)

print(torch.__version__)
-model = torchvision.models.detection.fasterrcnn_mobilenet_v3_large_320_fpn(
-    pretrained=True, box_score_thresh=0.7, rpn_post_nms_top_n_test=100, rpn_score_thresh=0.4, rpn_pre_nms_top_n_test=150
+model = fasterrcnn_mobilenet_v3_large_320_fpn(
+    weights=FasterRCNN_MobileNet_V3_Large_320_FPN_Weights.DEFAULT,
+    box_score_thresh=0.7,
+    rpn_post_nms_top_n_test=100,
+    rpn_score_thresh=0.4,
+    rpn_pre_nms_top_n_test=150,
)
model.eval()
...
...@@ -98,58 +98,6 @@ You can construct a model with random weights by calling its constructor:
    convnext_large = models.convnext_large()

We provide pre-trained models, using the PyTorch :mod:`torch.utils.model_zoo`.
-These can be constructed by passing ``pretrained=True``:
-
-.. code:: python
-
-    import torchvision.models as models
-    resnet18 = models.resnet18(pretrained=True)
-    alexnet = models.alexnet(pretrained=True)
-    squeezenet = models.squeezenet1_0(pretrained=True)
-    vgg16 = models.vgg16(pretrained=True)
-    densenet = models.densenet161(pretrained=True)
-    inception = models.inception_v3(pretrained=True)
-    googlenet = models.googlenet(pretrained=True)
-    shufflenet = models.shufflenet_v2_x1_0(pretrained=True)
-    mobilenet_v2 = models.mobilenet_v2(pretrained=True)
-    mobilenet_v3_large = models.mobilenet_v3_large(pretrained=True)
-    mobilenet_v3_small = models.mobilenet_v3_small(pretrained=True)
-    resnext50_32x4d = models.resnext50_32x4d(pretrained=True)
-    wide_resnet50_2 = models.wide_resnet50_2(pretrained=True)
-    mnasnet = models.mnasnet1_0(pretrained=True)
-    efficientnet_b0 = models.efficientnet_b0(pretrained=True)
-    efficientnet_b1 = models.efficientnet_b1(pretrained=True)
-    efficientnet_b2 = models.efficientnet_b2(pretrained=True)
-    efficientnet_b3 = models.efficientnet_b3(pretrained=True)
-    efficientnet_b4 = models.efficientnet_b4(pretrained=True)
-    efficientnet_b5 = models.efficientnet_b5(pretrained=True)
-    efficientnet_b6 = models.efficientnet_b6(pretrained=True)
-    efficientnet_b7 = models.efficientnet_b7(pretrained=True)
-    efficientnet_v2_s = models.efficientnet_v2_s(pretrained=True)
-    efficientnet_v2_m = models.efficientnet_v2_m(pretrained=True)
-    efficientnet_v2_l = models.efficientnet_v2_l(pretrained=True)
-    regnet_y_400mf = models.regnet_y_400mf(pretrained=True)
-    regnet_y_800mf = models.regnet_y_800mf(pretrained=True)
-    regnet_y_1_6gf = models.regnet_y_1_6gf(pretrained=True)
-    regnet_y_3_2gf = models.regnet_y_3_2gf(pretrained=True)
-    regnet_y_8gf = models.regnet_y_8gf(pretrained=True)
-    regnet_y_16gf = models.regnet_y_16gf(pretrained=True)
-    regnet_y_32gf = models.regnet_y_32gf(pretrained=True)
-    regnet_x_400mf = models.regnet_x_400mf(pretrained=True)
-    regnet_x_800mf = models.regnet_x_800mf(pretrained=True)
-    regnet_x_1_6gf = models.regnet_x_1_6gf(pretrained=True)
-    regnet_x_3_2gf = models.regnet_x_3_2gf(pretrained=True)
-    regnet_x_8gf = models.regnet_x_8gf(pretrained=True)
-    regnet_x_16gf = models.regnet_x_16gf(pretrained=True)
-    regnet_x_32gf = models.regnet_x_32gf(pretrained=True)
-    vit_b_16 = models.vit_b_16(pretrained=True)
-    vit_b_32 = models.vit_b_32(pretrained=True)
-    vit_l_16 = models.vit_l_16(pretrained=True)
-    vit_l_32 = models.vit_l_32(pretrained=True)
-    convnext_tiny = models.convnext_tiny(pretrained=True)
-    convnext_small = models.convnext_small(pretrained=True)
-    convnext_base = models.convnext_base(pretrained=True)
-    convnext_large = models.convnext_large(pretrained=True)

Instancing a pre-trained model will download its weights to a cache directory.
This directory can be set using the `TORCH_HOME` environment variable. See
...@@ -525,7 +473,7 @@ Obtaining a pre-trained quantized model can be done with a few lines of code:
.. code:: python

    import torchvision.models as models
-    model = models.quantization.mobilenet_v2(pretrained=True, quantize=True)
+    model = models.quantization.mobilenet_v2(weights=MobileNet_V2_QuantizedWeights.IMAGENET1K_QNNPACK_V1, quantize=True)
    model.eval()
    # run the model with quantized inputs and weights
    out = model(torch.rand(1, 3, 224, 224))
...
...@@ -6,7 +6,7 @@ import torchvision
HERE = osp.dirname(osp.abspath(__file__))
ASSETS = osp.dirname(osp.dirname(HERE))

-model = torchvision.models.resnet18(pretrained=False)
+model = torchvision.models.resnet18()
model.eval()

traced_model = torch.jit.script(model)
...
...@@ -19,7 +19,6 @@ import numpy as np
import torch
import matplotlib.pyplot as plt
import torchvision.transforms.functional as F
-import torchvision.transforms as T

plt.rcParams["savefig.bbox"] = "tight"
...@@ -88,24 +87,19 @@ plot(img1_batch)
# reduce the image sizes for the example to run faster. Image dimension must be
# divisible by 8.
+from torchvision.models.optical_flow import Raft_Large_Weights

-def preprocess(batch):
-    transforms = T.Compose(
-        [
-            T.ConvertImageDtype(torch.float32),
-            T.Normalize(mean=0.5, std=0.5),  # map [0, 1] into [-1, 1]
-            T.Resize(size=(520, 960)),
-        ]
-    )
-    batch = transforms(batch)
-    return batch
+weights = Raft_Large_Weights.DEFAULT
+transforms = weights.transforms()

-# If you can, run this example on a GPU, it will be a lot faster.
-device = "cuda" if torch.cuda.is_available() else "cpu"
+def preprocess(img1_batch, img2_batch):
+    img1_batch = F.resize(img1_batch, size=[520, 960])
+    img2_batch = F.resize(img2_batch, size=[520, 960])
+    return transforms(img1_batch, img2_batch)

-img1_batch = preprocess(img1_batch).to(device)
-img2_batch = preprocess(img2_batch).to(device)
+img1_batch, img2_batch = preprocess(img1_batch, img2_batch)

print(f"shape = {img1_batch.shape}, dtype = {img1_batch.dtype}")
...@@ -121,7 +115,10 @@ print(f"shape = {img1_batch.shape}, dtype = {img1_batch.dtype}")
from torchvision.models.optical_flow import raft_large

-model = raft_large(pretrained=True, progress=False).to(device)
+# If you can, run this example on a GPU, it will be a lot faster.
+device = "cuda" if torch.cuda.is_available() else "cpu"
+
+model = raft_large(weights=Raft_Large_Weights.DEFAULT, progress=False).to(device)
model = model.eval()

list_of_flows = model(img1_batch.to(device), img2_batch.to(device))
...@@ -182,10 +179,9 @@ plot(grid)
# from torchvision.io import write_jpeg
# for i, (img1, img2) in enumerate(zip(frames, frames[1:])):
#     # Note: it would be faster to predict batches of flows instead of individual flows
-#     img1 = preprocess(img1[None]).to(device)
-#     img2 = preprocess(img2[None]).to(device)
-#     list_of_flows = model(img1_batch, img2_batch)
+#     img1, img2 = preprocess(img1, img2)
+#     list_of_flows = model(img1.to(device), img2.to(device))
#     predicted_flow = list_of_flows[-1][0]
#     flow_img = flow_to_image(predicted_flow).to("cpu")
#     output_folder = "/tmp/"  # Update this to the folder of your choice
...
...@@ -139,12 +139,14 @@ show(drawn_boxes)
# Here is demo with a Faster R-CNN model loaded from
# :func:`~torchvision.models.detection.fasterrcnn_resnet50_fpn`

-from torchvision.models.detection import fasterrcnn_resnet50_fpn
+from torchvision.models.detection import fasterrcnn_resnet50_fpn, FasterRCNN_ResNet50_FPN_Weights

-model = fasterrcnn_resnet50_fpn(pretrained=True, progress=False)
+weights = FasterRCNN_ResNet50_FPN_Weights.DEFAULT
+model = fasterrcnn_resnet50_fpn(weights=weights, progress=False)
print(img.size())

-img = F.convert_image_dtype(img, torch.float)
+transforms = weights.transforms()
+img = transforms(img)

target = {}
target["boxes"] = boxes
target["labels"] = labels = torch.ones((masks.size(0),), dtype=torch.int64)
...
...@@ -85,20 +85,16 @@ show([transformed_dog1, transformed_dog2])
# Let's define a ``Predictor`` module that transforms the input tensor and then
# applies an ImageNet model on it.

-from torchvision.models import resnet18
+from torchvision.models import resnet18, ResNet18_Weights


class Predictor(nn.Module):

    def __init__(self):
        super().__init__()
-        self.resnet18 = resnet18(pretrained=True, progress=False).eval()
-        self.transforms = nn.Sequential(
-            T.Resize([256, ]),  # We use single int value inside a list due to torchscript type restrictions
-            T.CenterCrop(224),
-            T.ConvertImageDtype(torch.float),
-            T.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
-        )
+        weights = ResNet18_Weights.DEFAULT
+        self.resnet18 = resnet18(weights=weights, progress=False).eval()
+        self.transforms = weights.transforms()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        with torch.no_grad():
...
...@@ -73,14 +73,17 @@ show(result)
# :func:`~torchvision.models.detection.ssd300_vgg16`. For more details
# on the output of such models, you may refer to :ref:`instance_seg_output`.

-from torchvision.models.detection import fasterrcnn_resnet50_fpn
-from torchvision.transforms.functional import convert_image_dtype
+from torchvision.models.detection import fasterrcnn_resnet50_fpn, FasterRCNN_ResNet50_FPN_Weights


batch_int = torch.stack([dog1_int, dog2_int])
-batch = convert_image_dtype(batch_int, dtype=torch.float)

-model = fasterrcnn_resnet50_fpn(pretrained=True, progress=False)
+weights = FasterRCNN_ResNet50_FPN_Weights.DEFAULT
+transforms = weights.transforms()
+
+batch = transforms(batch_int)
+
+model = fasterrcnn_resnet50_fpn(weights=weights, progress=False)
model = model.eval()

outputs = model(batch)
...@@ -120,13 +123,15 @@ show(dogs_with_boxes)
# images must be normalized before they're passed to a semantic segmentation
# model.

-from torchvision.models.segmentation import fcn_resnet50
+from torchvision.models.segmentation import fcn_resnet50, FCN_ResNet50_Weights

+weights = FCN_ResNet50_Weights.DEFAULT
+transforms = weights.transforms(resize_size=None)

-model = fcn_resnet50(pretrained=True, progress=False)
+model = fcn_resnet50(weights=weights, progress=False)
model = model.eval()

-normalized_batch = F.normalize(batch, mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225))
+normalized_batch = transforms(batch)
output = model(normalized_batch)['out']
print(output.shape, output.min().item(), output.max().item())
...@@ -262,8 +267,14 @@ show(dogs_with_masks)
# of them may not have masks, like
# :func:`~torchvision.models.detection.fasterrcnn_resnet50_fpn`.

-from torchvision.models.detection import maskrcnn_resnet50_fpn
-model = maskrcnn_resnet50_fpn(pretrained=True, progress=False)
+from torchvision.models.detection import maskrcnn_resnet50_fpn, MaskRCNN_ResNet50_FPN_Weights
+
+weights = MaskRCNN_ResNet50_FPN_Weights.DEFAULT
+transforms = weights.transforms()
+
+batch = transforms(batch_int)
+
+model = maskrcnn_resnet50_fpn(weights=weights, progress=False)
model = model.eval()

output = model(batch)
...@@ -378,13 +389,17 @@ show(dogs_with_masks)
# Note that the keypoint detection model does not need normalized images.
#

-from torchvision.models.detection import keypointrcnn_resnet50_fpn
+from torchvision.models.detection import keypointrcnn_resnet50_fpn, KeypointRCNN_ResNet50_FPN_Weights
from torchvision.io import read_image

person_int = read_image(str(Path("assets") / "person1.jpg"))
-person_float = convert_image_dtype(person_int, dtype=torch.float)

-model = keypointrcnn_resnet50_fpn(pretrained=True, progress=False)
+weights = KeypointRCNN_ResNet50_FPN_Weights.DEFAULT
+transforms = weights.transforms()
+
+person_float = transforms(person_int)
+
+model = keypointrcnn_resnet50_fpn(weights=weights, progress=False)
model = model.eval()

outputs = model([person_float])
...
import torch
-import torchvision
from torch.utils.mobile_optimizer import optimize_for_mobile
+from torchvision.models.detection import (
+    fasterrcnn_mobilenet_v3_large_320_fpn,
+    FasterRCNN_MobileNet_V3_Large_320_FPN_Weights,
+)

print(torch.__version__)
-model = torchvision.models.detection.fasterrcnn_mobilenet_v3_large_320_fpn(
-    pretrained=True, box_score_thresh=0.7, rpn_post_nms_top_n_test=100, rpn_score_thresh=0.4, rpn_pre_nms_top_n_test=150
+model = fasterrcnn_mobilenet_v3_large_320_fpn(
+    weights=FasterRCNN_MobileNet_V3_Large_320_FPN_Weights.DEFAULT,
+    box_score_thresh=0.7,
+    rpn_post_nms_top_n_test=100,
+    rpn_score_thresh=0.4,
+    rpn_pre_nms_top_n_test=150,
)
model.eval()
...
...@@ -43,7 +43,7 @@ Since it expects tensors with a size of N x 3 x 299 x 299, to validate the model
```
torchrun --nproc_per_node=8 train.py --model inception_v3\
-    --val-resize-size 342 --val-crop-size 299 --train-crop-size 299 --test-only --pretrained
+    --test-only --weights Inception_V3_Weights.IMAGENET1K_V1
```

### ResNet
...@@ -96,22 +96,14 @@ The weights of the B5-B7 variants are ported from Luke Melas' [EfficientNet-PyTo
All models were trained using Bicubic interpolation and each have custom crop and resize sizes. To validate the models use the following commands:
```
-torchrun --nproc_per_node=8 train.py --model efficientnet_b0 --interpolation bicubic\
-    --val-resize-size 256 --val-crop-size 224 --train-crop-size 224 --test-only --pretrained
-torchrun --nproc_per_node=8 train.py --model efficientnet_b1 --interpolation bicubic\
-    --val-resize-size 256 --val-crop-size 240 --train-crop-size 240 --test-only --pretrained
-torchrun --nproc_per_node=8 train.py --model efficientnet_b2 --interpolation bicubic\
-    --val-resize-size 288 --val-crop-size 288 --train-crop-size 288 --test-only --pretrained
-torchrun --nproc_per_node=8 train.py --model efficientnet_b3 --interpolation bicubic\
-    --val-resize-size 320 --val-crop-size 300 --train-crop-size 300 --test-only --pretrained
-torchrun --nproc_per_node=8 train.py --model efficientnet_b4 --interpolation bicubic\
-    --val-resize-size 384 --val-crop-size 380 --train-crop-size 380 --test-only --pretrained
-torchrun --nproc_per_node=8 train.py --model efficientnet_b5 --interpolation bicubic\
-    --val-resize-size 456 --val-crop-size 456 --train-crop-size 456 --test-only --pretrained
-torchrun --nproc_per_node=8 train.py --model efficientnet_b6 --interpolation bicubic\
-    --val-resize-size 528 --val-crop-size 528 --train-crop-size 528 --test-only --pretrained
-torchrun --nproc_per_node=8 train.py --model efficientnet_b7 --interpolation bicubic\
-    --val-resize-size 600 --val-crop-size 600 --train-crop-size 600 --test-only --pretrained
+torchrun --nproc_per_node=8 train.py --model efficientnet_b0 --test-only --weights EfficientNet_B0_Weights.IMAGENET1K_V1
+torchrun --nproc_per_node=8 train.py --model efficientnet_b1 --test-only --weights EfficientNet_B1_Weights.IMAGENET1K_V1
+torchrun --nproc_per_node=8 train.py --model efficientnet_b2 --test-only --weights EfficientNet_B2_Weights.IMAGENET1K_V1
+torchrun --nproc_per_node=8 train.py --model efficientnet_b3 --test-only --weights EfficientNet_B3_Weights.IMAGENET1K_V1
+torchrun --nproc_per_node=8 train.py --model efficientnet_b4 --test-only --weights EfficientNet_B4_Weights.IMAGENET1K_V1
+torchrun --nproc_per_node=8 train.py --model efficientnet_b5 --test-only --weights EfficientNet_B5_Weights.IMAGENET1K_V1
+torchrun --nproc_per_node=8 train.py --model efficientnet_b6 --test-only --weights EfficientNet_B6_Weights.IMAGENET1K_V1
+torchrun --nproc_per_node=8 train.py --model efficientnet_b7 --test-only --weights EfficientNet_B7_Weights.IMAGENET1K_V1
```
...
...@@ -6,6 +6,7 @@ from torchvision.transforms.functional import InterpolationMode
class ClassificationPresetTrain:
    def __init__(
        self,
+        *,
        crop_size,
        mean=(0.485, 0.456, 0.406),
        std=(0.229, 0.224, 0.225),
...@@ -46,6 +47,7 @@ class ClassificationPresetTrain:
class ClassificationPresetEval:
    def __init__(
        self,
+        *,
        crop_size,
        resize_size=256,
        mean=(0.485, 0.456, 0.406),
...
...@@ -15,12 +15,6 @@ from torch.utils.data.dataloader import default_collate
from torchvision.transforms.functional import InterpolationMode

-try:
-    from torchvision import prototype
-except ImportError:
-    prototype = None
-

def train_one_epoch(model, criterion, optimizer, data_loader, device, epoch, args, model_ema=None, scaler=None):
    model.train()
    metric_logger = utils.MetricLogger(delimiter=" ")
...@@ -154,16 +148,11 @@ def load_data(traindir, valdir, args):
        print(f"Loading dataset_test from {cache_path}")
        dataset_test, _ = torch.load(cache_path)
    else:
-        if not args.prototype:
-            preprocessing = presets.ClassificationPresetEval(
-                crop_size=val_crop_size, resize_size=val_resize_size, interpolation=interpolation
-            )
-        else:
-            if args.weights:
-                weights = prototype.models.get_weight(args.weights)
-                preprocessing = weights.transforms()
-            else:
-                preprocessing = prototype.transforms.ImageClassificationEval(
-                    crop_size=val_crop_size, resize_size=val_resize_size, interpolation=interpolation
-                )
+        if args.weights and args.test_only:
+            weights = torchvision.models.get_weight(args.weights)
+            preprocessing = weights.transforms()
+        else:
+            preprocessing = presets.ClassificationPresetEval(
+                crop_size=val_crop_size, resize_size=val_resize_size, interpolation=interpolation
+            )
...@@ -191,10 +180,6 @@ def load_data(traindir, valdir, args):

def main(args):
-    if args.prototype and prototype is None:
-        raise ImportError("The prototype module couldn't be found. Please install the latest torchvision nightly.")
-    if not args.prototype and args.weights:
-        raise ValueError("The weights parameter works only in prototype mode. Please pass the --prototype argument.")
    if args.output_dir:
        utils.mkdir(args.output_dir)
...@@ -236,10 +221,7 @@ def main(args):
    )

    print("Creating model")
-    if not args.prototype:
-        model = torchvision.models.__dict__[args.model](pretrained=args.pretrained, num_classes=num_classes)
-    else:
-        model = prototype.models.__dict__[args.model](weights=args.weights, num_classes=num_classes)
+    model = torchvision.models.__dict__[args.model](weights=args.weights, num_classes=num_classes)
    model.to(device)

    if args.distributed and args.sync_bn:
...@@ -446,12 +428,6 @@ def get_args_parser(add_help=True):
        help="Only test the model",
        action="store_true",
    )
-    parser.add_argument(
-        "--pretrained",
-        dest="pretrained",
-        help="Use pre-trained models from the modelzoo",
-        action="store_true",
-    )
    parser.add_argument("--auto-augment", default=None, type=str, help="auto augment policy (default: None)")
    parser.add_argument("--random-erase", default=0.0, type=float, help="random erasing probability (default: 0.0)")
...@@ -496,14 +472,6 @@ def get_args_parser(add_help=True):
    parser.add_argument(
        "--ra-reps", default=3, type=int, help="number of repetitions for Repeated Augmentation (default: 3)"
    )
-    # Prototype models only
-    parser.add_argument(
-        "--prototype",
-        dest="prototype",
-        help="Use prototype model builders instead those from main area",
-        action="store_true",
-    )
    parser.add_argument("--weights", default=None, type=str, help="the weights enum name to load")

    return parser
...
...@@ -12,17 +12,7 @@ from torch import nn
from train import train_one_epoch, evaluate, load_data

-try:
-    from torchvision import prototype
-except ImportError:
-    prototype = None
-

def main(args):
-    if args.prototype and prototype is None:
-        raise ImportError("The prototype module couldn't be found. Please install the latest torchvision nightly.")
-    if not args.prototype and args.weights:
-        raise ValueError("The weights parameter works only in prototype mode. Please pass the --prototype argument.")
    if args.output_dir:
        utils.mkdir(args.output_dir)
...@@ -56,10 +46,7 @@ def main(args):
    print("Creating model", args.model)
    # when training quantized models, we always start from a pre-trained fp32 reference model
-    if not args.prototype:
-        model = torchvision.models.quantization.__dict__[args.model](pretrained=True, quantize=args.test_only)
-    else:
-        model = prototype.models.quantization.__dict__[args.model](weights=args.weights, quantize=args.test_only)
+    model = torchvision.models.quantization.__dict__[args.model](weights=args.weights, quantize=args.test_only)
    model.to(device)

    if not (args.test_only or args.post_training_quantize):
...@@ -264,14 +251,6 @@ def get_args_parser(add_help=True):
        "--train-crop-size", default=224, type=int, help="the random crop size used for training (default: 224)"
    )
    parser.add_argument("--clip-grad-norm", default=None, type=float, help="the maximum gradient norm (default None)")
-    # Prototype models only
-    parser.add_argument(
-        "--prototype",
-        dest="prototype",
-        help="Use prototype model builders instead those from main area",
-        action="store_true",
-    )
    parser.add_argument("--weights", default=None, type=str, help="the weights enum name to load")

    return parser
...
...@@ -330,22 +330,22 @@ def store_model_weights(model, checkpoint_path, checkpoint_key="model", strict=T
        from torchvision import models as M

        # Classification
-        model = M.mobilenet_v3_large(pretrained=False)
+        model = M.mobilenet_v3_large(weights=None)
        print(store_model_weights(model, './class.pth'))

        # Quantized Classification
-        model = M.quantization.mobilenet_v3_large(pretrained=False, quantize=False)
+        model = M.quantization.mobilenet_v3_large(weights=None, quantize=False)
        model.fuse_model(is_qat=True)
        model.qconfig = torch.ao.quantization.get_default_qat_qconfig('qnnpack')
        _ = torch.ao.quantization.prepare_qat(model, inplace=True)
        print(store_model_weights(model, './qat.pth'))

        # Object Detection
-        model = M.detection.fasterrcnn_mobilenet_v3_large_fpn(pretrained=False, pretrained_backbone=False)
+        model = M.detection.fasterrcnn_mobilenet_v3_large_fpn(weights=None, weights_backbone=None)
        print(store_model_weights(model, './obj.pth'))

        # Segmentation
-        model = M.segmentation.deeplabv3_mobilenet_v3_large(pretrained=False, pretrained_backbone=False, aux_loss=True)
+        model = M.segmentation.deeplabv3_mobilenet_v3_large(weights=None, weights_backbone=None, aux_loss=True)
        print(store_model_weights(model, './segm.pth', strict=False))

    Args:
...
...@@ -24,35 +24,35 @@ Except otherwise noted, all models have been trained on 8x V100 GPUs.
```
torchrun --nproc_per_node=8 train.py\
    --dataset coco --model fasterrcnn_resnet50_fpn --epochs 26\
-    --lr-steps 16 22 --aspect-ratio-group-factor 3
+    --lr-steps 16 22 --aspect-ratio-group-factor 3 --weights-backbone ResNet50_Weights.IMAGENET1K_V1
```

### Faster R-CNN MobileNetV3-Large FPN
```
torchrun --nproc_per_node=8 train.py\
    --dataset coco --model fasterrcnn_mobilenet_v3_large_fpn --epochs 26\
-    --lr-steps 16 22 --aspect-ratio-group-factor 3
+    --lr-steps 16 22 --aspect-ratio-group-factor 3 --weights-backbone MobileNet_V3_Large_Weights.IMAGENET1K_V1
```

### Faster R-CNN MobileNetV3-Large 320 FPN
```
torchrun --nproc_per_node=8 train.py\
    --dataset coco --model fasterrcnn_mobilenet_v3_large_320_fpn --epochs 26\
-    --lr-steps 16 22 --aspect-ratio-group-factor 3
+    --lr-steps 16 22 --aspect-ratio-group-factor 3 --weights-backbone MobileNet_V3_Large_Weights.IMAGENET1K_V1
```

### FCOS ResNet-50 FPN
```
torchrun --nproc_per_node=8 train.py\
    --dataset coco --model fcos_resnet50_fpn --epochs 26\
-    --lr-steps 16 22 --aspect-ratio-group-factor 3 --lr 0.01 --amp
+    --lr-steps 16 22 --aspect-ratio-group-factor 3 --lr 0.01 --amp --weights-backbone ResNet50_Weights.IMAGENET1K_V1
```

### RetinaNet
```
torchrun --nproc_per_node=8 train.py\
    --dataset coco --model retinanet_resnet50_fpn --epochs 26\
-    --lr-steps 16 22 --aspect-ratio-group-factor 3 --lr 0.01
+    --lr-steps 16 22 --aspect-ratio-group-factor 3 --lr 0.01 --weights-backbone ResNet50_Weights.IMAGENET1K_V1
```

### SSD300 VGG16
...@@ -60,7 +60,7 @@ torchrun --nproc_per_node=8 train.py\
torchrun --nproc_per_node=8 train.py\
    --dataset coco --model ssd300_vgg16 --epochs 120\
    --lr-steps 80 110 --aspect-ratio-group-factor 3 --lr 0.002 --batch-size 4\
-    --weight-decay 0.0005 --data-augmentation ssd
+    --weight-decay 0.0005 --data-augmentation ssd --weights-backbone VGG16_Weights.IMAGENET1K_FEATURES
```

### SSDlite320 MobileNetV3-Large
...@@ -68,7 +68,7 @@ torchrun --nproc_per_node=8 train.py\
torchrun --nproc_per_node=8 train.py\
    --dataset coco --model ssdlite320_mobilenet_v3_large --epochs 660\
    --aspect-ratio-group-factor 3 --lr-scheduler cosineannealinglr --lr 0.15 --batch-size 24\
-    --weight-decay 0.00004 --data-augmentation ssdlite
+    --weight-decay 0.00004 --data-augmentation ssdlite --weights-backbone MobileNet_V3_Large_Weights.IMAGENET1K_V1
```

...@@ -76,7 +76,7 @@ torchrun --nproc_per_node=8 train.py\
```
torchrun --nproc_per_node=8 train.py\
    --dataset coco --model maskrcnn_resnet50_fpn --epochs 26\
-    --lr-steps 16 22 --aspect-ratio-group-factor 3
+    --lr-steps 16 22 --aspect-ratio-group-factor 3 --weights-backbone ResNet50_Weights.IMAGENET1K_V1
```

...@@ -84,5 +84,5 @@ torchrun --nproc_per_node=8 train.py\
```
torchrun --nproc_per_node=8 train.py\
    --dataset coco_kp --model keypointrcnn_resnet50_fpn --epochs 46\
-    --lr-steps 36 43 --aspect-ratio-group-factor 3
+    --lr-steps 36 43 --aspect-ratio-group-factor 3 --weights-backbone ResNet50_Weights.IMAGENET1K_V1
```
...@@ -33,12 +33,6 @@ from engine import train_one_epoch, evaluate
from group_by_aspect_ratio import GroupedBatchSampler, create_aspect_ratio_groups

-try:
-    from torchvision import prototype
-except ImportError:
-    prototype = None
-

def get_dataset(name, image_set, transform, data_path):
    paths = {"coco": (data_path, get_coco, 91), "coco_kp": (data_path, get_coco_kp, 2)}
    p, ds_fn, num_classes = paths[name]
...@@ -49,15 +43,13 @@ def get_dataset(name, image_set, transform, data_path):

def get_transform(train, args):
    if train:
-        return presets.DetectionPresetTrain(args.data_augmentation)
-    elif not args.prototype:
-        return presets.DetectionPresetEval()
-    else:
-        if args.weights:
-            weights = prototype.models.get_weight(args.weights)
-            return weights.transforms()
-        else:
-            return prototype.transforms.ObjectDetectionEval()
+        return presets.DetectionPresetTrain(data_augmentation=args.data_augmentation)
+    elif args.weights and args.test_only:
+        weights = torchvision.models.get_weight(args.weights)
+        trans = weights.transforms()
+        return lambda img, target: (trans(img), target)
+    else:
+        return presets.DetectionPresetEval()


def get_args_parser(add_help=True):
...@@ -132,25 +124,12 @@ def get_args_parser(add_help=True):
        help="Only test the model",
        action="store_true",
    )
-    parser.add_argument(
-        "--pretrained",
-        dest="pretrained",
-        help="Use pre-trained models from the modelzoo",
-        action="store_true",
-    )

    # distributed training parameters
    parser.add_argument("--world-size", default=1, type=int, help="number of distributed processes")
    parser.add_argument("--dist-url", default="env://", type=str, help="url used to set up distributed training")
-    # Prototype models only
-    parser.add_argument(
-        "--prototype",
-        dest="prototype",
-        help="Use prototype model builders instead those from main area",
-        action="store_true",
-    )
    parser.add_argument("--weights", default=None, type=str, help="the weights enum name to load")
+    parser.add_argument("--weights-backbone", default=None, type=str, help="the backbone weights enum name to load")

    # Mixed precision training parameters
    parser.add_argument("--amp", action="store_true", help="Use torch.cuda.amp for mixed precision training")
...@@ -159,10 +138,6 @@ def get_args_parser(add_help=True):

def main(args):
-    if args.prototype and prototype is None:
-        raise ImportError("The prototype module couldn't be found. Please install the latest torchvision nightly.")
-    if not args.prototype and args.weights:
-        raise ValueError("The weights parameter works only in prototype mode. Please pass the --prototype argument.")
    if args.output_dir:
        utils.mkdir(args.output_dir)
...@@ -204,12 +179,9 @@ def main(args):
    if "rcnn" in args.model:
        if args.rpn_score_thresh is not None:
            kwargs["rpn_score_thresh"] = args.rpn_score_thresh
-    if not args.prototype:
-        model = torchvision.models.detection.__dict__[args.model](
-            pretrained=args.pretrained, num_classes=num_classes, **kwargs
-        )
-    else:
-        model = prototype.models.detection.__dict__[args.model](weights=args.weights, num_classes=num_classes, **kwargs)
+    model = torchvision.models.detection.__dict__[args.model](
+        weights=args.weights, weights_backbone=args.weights_backbone, num_classes=num_classes, **kwargs
+    )
    model.to(device)
    if args.distributed and args.sync_bn:
        model = torch.nn.SyncBatchNorm.convert_sync_batchnorm(model)
...
...@@ -51,7 +51,7 @@ torchrun --nproc_per_node 8 --nnodes 1 train.py \
### Evaluation

```
-torchrun --nproc_per_node 1 --nnodes 1 train.py --val-dataset sintel --batch-size 1 --dataset-root $dataset_root --model raft_large --pretrained
+torchrun --nproc_per_node 1 --nnodes 1 train.py --val-dataset sintel --batch-size 1 --dataset-root $dataset_root --model raft_large --weights Raft_Large_Weights.C_T_SKHT_V2
```

This should give an epe of about 1.3822 on the clean pass and 2.7161 on the
...@@ -67,6 +67,6 @@ Sintel val final epe: 2.7161 1px: 0.8528 3px: 0.9204 5px: 0.9392 per_image_epe:
You can also evaluate on Kitti train:

```
-torchrun --nproc_per_node 1 --nnodes 1 train.py --val-dataset kitti --batch-size 1 --dataset-root $dataset_root --model raft_large --pretrained
+torchrun --nproc_per_node 1 --nnodes 1 train.py --val-dataset kitti --batch-size 1 --dataset-root $dataset_root --model raft_large --weights Raft_Large_Weights.C_T_SKHT_V2
Kitti val epe: 4.7968 1px: 0.6388 3px: 0.8197 5px: 0.8661 per_image_epe: 4.5118 f1: 16.0679
```
...@@ -22,6 +22,7 @@ class OpticalFlowPresetEval(torch.nn.Module):
class OpticalFlowPresetTrain(torch.nn.Module):
    def __init__(
        self,
+        *,
        # RandomResizeAndCrop params
        crop_size,
        min_scale=-0.2,
...
...@@ -9,11 +9,6 @@ import utils
from presets import OpticalFlowPresetTrain, OpticalFlowPresetEval
from torchvision.datasets import KittiFlow, FlyingChairs, FlyingThings3D, Sintel, HD1K

-try:
-    from torchvision import prototype
-except ImportError:
-    prototype = None
-

def get_train_dataset(stage, dataset_root):
    if stage == "chairs":
...@@ -138,12 +133,18 @@ def _evaluate(model, args, val_dataset, *, padder_mode, num_flow_updates=None, b
def evaluate(model, args):
    val_datasets = args.val_dataset or []

-    if args.prototype:
-        if args.weights:
-            weights = prototype.models.get_weight(args.weights)
-            preprocessing = weights.transforms()
-        else:
-            preprocessing = prototype.transforms.OpticalFlowEval()
+    if args.weights and args.test_only:
+        weights = torchvision.models.get_weight(args.weights)
+        trans = weights.transforms()
+
+        def preprocessing(img1, img2, flow, valid_flow_mask):
+            img1, img2 = trans(img1, img2)
+            if flow is not None and not isinstance(flow, torch.Tensor):
+                flow = torch.from_numpy(flow)
+            if valid_flow_mask is not None and not isinstance(valid_flow_mask, torch.Tensor):
+                valid_flow_mask = torch.from_numpy(valid_flow_mask)
+            return img1, img2, flow, valid_flow_mask
    else:
        preprocessing = OpticalFlowPresetEval()
...@@ -201,20 +202,14 @@ def train_one_epoch(model, optimizer, scheduler, train_loader, logger, args):

def main(args):
-    if args.prototype and prototype is None:
-        raise ImportError("The prototype module couldn't be found. Please install the latest torchvision nightly.")
-    if not args.prototype and args.weights:
-        raise ValueError("The weights parameter works only in prototype mode. Please pass the --prototype argument.")
    utils.setup_ddp(args)
+    args.test_only = args.train_dataset is None

    if args.distributed and args.device == "cpu":
        raise ValueError("The device must be cuda if we want to run in distributed mode using torchrun")
    device = torch.device(args.device)

-    if args.prototype:
-        model = prototype.models.optical_flow.__dict__[args.model](weights=args.weights)
-    else:
-        model = torchvision.models.optical_flow.__dict__[args.model](pretrained=args.pretrained)
+    model = torchvision.models.optical_flow.__dict__[args.model](weights=args.weights)

    if args.distributed:
        model = model.to(args.local_rank)
...@@ -228,7 +223,7 @@ def main(args):
        checkpoint = torch.load(args.resume, map_location="cpu")
        model_without_ddp.load_state_dict(checkpoint["model"])

-    if args.train_dataset is None:
+    if args.test_only:
        # Set deterministic CUDNN algorithms, since they can affect epe a fair bit.
        torch.backends.cudnn.benchmark = False
        torch.backends.cudnn.deterministic = True
...@@ -356,8 +351,7 @@ def get_args_parser(add_help=True):
    parser.add_argument(
        "--model", type=str, default="raft_large", help="The name of the model to use - either raft_large or raft_small"
    )
-    # TODO: resume, pretrained, and weights should be in an exclusive arg group
-    parser.add_argument("--pretrained", action="store_true", help="Whether to use pretrained weights")
+    # TODO: resume and weights should be in an exclusive arg group
    parser.add_argument(
        "--num_flow_updates",
...@@ -376,13 +370,6 @@ def get_args_parser(add_help=True):
        required=True,
    )

-    # Prototype models only
-    parser.add_argument(
-        "--prototype",
-        dest="prototype",
-        help="Use prototype model builders instead those from main area",
-        action="store_true",
-    )
    parser.add_argument("--weights", default=None, type=str, help="the weights enum name to load.")
    parser.add_argument("--device", default="cuda", type=str, help="device (Use cuda or cpu, Default: cuda)")
...