Commit 60443d7f authored by A. Unique TensorFlower's avatar A. Unique TensorFlower

Merge pull request #9536 from PurdueCAM2Project:yolo

PiperOrigin-RevId: 347467379
parents 3b7017a5 84df9351
# YOLO Object Detectors, You Only Look Once
[![Paper](http://img.shields.io/badge/Paper-arXiv.1804.02767-B3181B?logo=arXiv)](https://arxiv.org/abs/1804.02767)
[![Paper](http://img.shields.io/badge/Paper-arXiv.2004.10934-B3181B?logo=arXiv)](https://arxiv.org/abs/2004.10934)
This repository is an unofficial implementation of the following papers. We
have taken great care to keep every component faithful to the original papers
and to the original Darknet repository.
* [YOLOv3: An Incremental Improvement](https://arxiv.org/abs/1804.02767)
* [YOLOv4: Optimal Speed and Accuracy of Object Detection](https://arxiv.org/abs/2004.10934)
## Description
YOLOv1, the original implementation, was released in 2015, providing a
ground-breaking algorithm that quickly processes images and locates objects in
a single pass through the detector. The original implementation used a
backbone derived from state-of-the-art image classifiers of the time, such as
[GoogLeNet](https://arxiv.org/abs/1409.4842) and
[VGG](https://arxiv.org/abs/1409.1556). More attention was given to the novel
YOLO detection head, which allowed object detection in a single pass over an
image. Though limited, the network could predict up to 90 bounding boxes per
image and was tested on about 80 classes per box. The model could also only
make predictions at a single scale. These attributes made YOLOv1 more limited
and less versatile, so as the years passed, the developers continued to update
and improve the model.
YOLOv3 and YOLOv4 are the most up-to-date and capable versions of the YOLO
family. These models use a custom backbone called Darknet53 that applies
lessons from the ResNet paper to improve their predictions. The new backbone
also allows objects to be detected at multiple scales. As for the new
detection head, the model now predicts bounding boxes using a set of anchor
box priors (anchor boxes) as suggestions. The multiscale predictions, in
combination with the anchor boxes, allow the network to make up to 1000 object
predictions on a single image. Finally, the new loss function forces the
network to make better predictions by using Intersection over Union (IoU) to
inform the model's confidence rather than relying on the mean squared error of
the entire output.
## Authors
* Vishnu Samardh Banna ([@GitHub vishnubanna](https://github.com/vishnubanna))
* Anirudh Vegesana ([@GitHub anivegesana](https://github.com/anivegesana))
* Akhil Chinnakotla ([@GitHub The-Indian-Chinna](https://github.com/The-Indian-Chinna))
* Tristan Yan ([@GitHub Tyan3001](https://github.com/Tyan3001))
* Naveen Vivek ([@GitHub naveen-vivek](https://github.com/naveen-vivek))
## Table of Contents
* [Our Goal](#our-goal)
* [Models in the library](#models-in-the-library)
* [References](#references)
## Our Goal
Our goal with this conversion is to provide implementations of the Darknet
backbones and the YOLO detection head. We have built the model in such a way
that the YOLO head can be connected to a new, more powerful backbone if a
person chooses to, as sketched below.
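As a rough sketch of what this might look like (assuming this project's
package paths; the loop at the end only stands in for a real detection head),
the backbone can be built on its own and its multilevel endpoints handed to
any head:

```python
import tensorflow as tf

from official.vision.beta.projects.yolo.modeling.backbones import darknet

# Build a CSPDarknet53 backbone that reports feature maps for levels 3-5.
backbone = darknet.Darknet(model_id="cspdarknet53", min_level=3, max_level=5)

inputs = tf.keras.Input(shape=(416, 416, 3))
endpoints = backbone(inputs)  # dict keyed by level: {"3": ..., "4": ..., "5": ...}

# Any head that accepts a dict of multilevel feature maps can be attached here.
for level, features in endpoints.items():
  print(level, features.shape)
```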
## Models in the library
| Object Detectors | Classifiers  |
| :--------------: | :----------: |
| Yolo-v3          | Darknet53    |
| Yolo-v3 tiny     | CSPDarknet53 |
| Yolo-v3 spp      |              |
| Yolo-v4          |              |
| Yolo-v4 tiny     |              |
## Requirements
[![TensorFlow 2.2](https://img.shields.io/badge/TensorFlow-2.2-FF6F00?logo=tensorflow)](https://github.com/tensorflow/tensorflow/releases/tag/v2.2.0)
[![Python 3.8](https://img.shields.io/badge/Python-3.8-3776AB)](https://www.python.org/downloads/release/python-380/)
# Copyright 2020 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""All necessary imports for registration."""
# pylint: disable=unused-import
from official.common import registry_imports
from official.vision.beta.projects.yolo.configs import darknet_classification
from official.vision.beta.projects.yolo.modeling.backbones import darknet
from official.vision.beta.projects.yolo.tasks import image_classification
# Lint as: python3
# Copyright 2020 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Backbones configurations."""
import dataclasses
from official.modeling import hyperparams
from official.vision.beta.configs import backbones
@dataclasses.dataclass
class DarkNet(hyperparams.Config):
"""DarkNet config."""
model_id: str = "darknet53"
@dataclasses.dataclass
class Backbone(backbones.Backbone):
darknet: DarkNet = DarkNet()
# Lint as: python3
# Copyright 2020 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Image classification with darknet configs."""
from typing import List, Optional
import dataclasses
from official.core import config_definitions as cfg
from official.core import exp_factory
from official.modeling import hyperparams
from official.vision.beta.configs import common
from official.vision.beta.configs import image_classification as imc
from official.vision.beta.projects.yolo.configs import backbones
@dataclasses.dataclass
class ImageClassificationModel(hyperparams.Config):
num_classes: int = 0
input_size: List[int] = dataclasses.field(default_factory=list)
backbone: backbones.Backbone = backbones.Backbone(
type='darknet', darknet=backbones.DarkNet())
dropout_rate: float = 0.0
norm_activation: common.NormActivation = common.NormActivation()
# Adds a BatchNormalization layer pre-GlobalAveragePooling in classification
add_head_batch_norm: bool = False
@dataclasses.dataclass
class Losses(hyperparams.Config):
one_hot: bool = True
label_smoothing: float = 0.0
l2_weight_decay: float = 0.0
@dataclasses.dataclass
class ImageClassificationTask(cfg.TaskConfig):
"""The model config."""
model: ImageClassificationModel = ImageClassificationModel()
train_data: imc.DataConfig = imc.DataConfig(is_training=True)
validation_data: imc.DataConfig = imc.DataConfig(is_training=False)
evaluation: imc.Evaluation = imc.Evaluation()
losses: Losses = Losses()
gradient_clip_norm: float = 0.0
logging_dir: Optional[str] = None
@exp_factory.register_config_factory('darknet_classification')
def image_classification() -> cfg.ExperimentConfig:
"""Image classification general."""
return cfg.ExperimentConfig(
task=ImageClassificationTask(),
trainer=cfg.TrainerConfig(),
restrictions=[
'task.train_data.is_training != None',
'task.validation_data.is_training != None'
])
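# Usage sketch (illustrative, not part of this module's API): once this module
# has been imported, the decorator above registers 'darknet_classification'
# with the experiment factory, and individual fields can be overridden. The
# values below are examples, not defaults.
def _example_get_experiment() -> cfg.ExperimentConfig:
  """Fetches the registered 'darknet_classification' experiment config."""
  config = exp_factory.get_exp_config('darknet_classification')
  config.task.model.num_classes = 1001
  config.task.model.input_size = [256, 256, 3]
  config.task.model.backbone.darknet.model_id = 'cspdarknet53'
  return config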
runtime:
distribution_strategy: 'mirrored'
mixed_precision_dtype: 'float32'
task:
model:
num_classes: 1001
input_size: [256, 256, 3]
backbone:
type: 'darknet'
darknet:
model_id: 'cspdarknet53'
norm_activation:
activation: 'mish'
losses:
l2_weight_decay: 0.0005
one_hot: true
label_smoothing: 0.1
train_data:
input_path: 'imagenet-2012-tfrecord/train*'
is_training: true
global_batch_size: 128
dtype: 'float16'
validation_data:
input_path: 'imagenet-2012-tfrecord/valid*'
is_training: true
global_batch_size: 128
dtype: 'float16'
drop_remainder: false
trainer:
train_steps: 1200000 # epochs: 120
validation_steps: 400 # size of validation data
validation_interval: 10000
steps_per_loop: 10000
summary_interval: 10000
checkpoint_interval: 10000
optimizer_config:
optimizer:
type: 'sgd'
sgd:
momentum: 0.9
learning_rate:
type: 'polynomial'
polynomial:
initial_learning_rate: 0.1
end_learning_rate: 0.0001
power: 4.0
decay_steps: 1200000
warmup:
type: 'linear'
linear:
warmup_steps: 1000 # learning rate rises from 0 to 0.1 over 1000 steps
runtime:
distribution_strategy: 'mirrored'
mixed_precision_dtype: 'float16'
loss_scale: 'dynamic'
num_gpus: 2
task:
model:
num_classes: 1001
input_size: [256, 256, 3]
backbone:
type: 'darknet'
darknet:
model_id: 'cspdarknet53'
norm_activation:
activation: 'mish'
losses:
l2_weight_decay: 0.0005
one_hot: true
train_data:
tfds_name: 'imagenet2012'
tfds_split: 'train'
tfds_data_dir: '~/tensorflow_datasets/'
tfds_download: true
is_training: true
global_batch_size: 16 # default = 128
dtype: 'float16'
shuffle_buffer_size: 100
validation_data:
tfds_name: 'imagenet2012'
tfds_split: 'validation'
tfds_data_dir: '~/tensorflow_datasets/'
tfds_download: true
is_training: true
global_batch_size: 16 # default = 128
dtype: 'float16'
drop_remainder: false
shuffle_buffer_size: 100
trainer:
train_steps: 9600000 # epochs: 120, 1200000 * 128/batchsize
validation_steps: 3200 # size of validation data, 400 * 128/batchsize
validation_interval: 10000 # 10000
steps_per_loop: 10000
summary_interval: 10000
checkpoint_interval: 10000
optimizer_config:
optimizer:
type: 'sgd'
sgd:
momentum: 0.9
learning_rate:
type: 'polynomial'
polynomial:
initial_learning_rate: 0.0125 # 0.1 * batchsize/128, default = 0.1
end_learning_rate: 0.0000125 # 0.0001 * batchsize/128, default = 0.0001
power: 4.0
decay_steps: 9592000 # 1199000 * 128/batchsize, default = 1200000 - 1000 = 1199000
warmup:
type: 'linear'
linear:
warmup_steps: 8000 # 1000 * 128/batchsize, default = 1000
runtime:
distribution_strategy: 'mirrored'
mixed_precision_dtype: 'float32'
task:
model:
num_classes: 1001
input_size: [256, 256, 3]
backbone:
type: 'darknet'
darknet:
model_id: 'darknet53'
norm_activation:
activation: 'mish'
losses:
l2_weight_decay: 0.0005
one_hot: true
train_data:
input_path: 'imagenet-2012-tfrecord/train*'
is_training: true
global_batch_size: 128
dtype: 'float16'
validation_data:
input_path: 'imagenet-2012-tfrecord/valid*'
is_training: true
global_batch_size: 128
dtype: 'float16'
drop_remainder: false
trainer:
train_steps: 800000 # epochs: 80
validation_steps: 400 # size of validation data
validation_interval: 10000
steps_per_loop: 10000
summary_interval: 10000
checkpoint_interval: 10000
optimizer_config:
optimizer:
type: 'sgd'
sgd:
momentum: 0.9
learning_rate:
type: 'polynomial'
polynomial:
initial_learning_rate: 0.1
end_learning_rate: 0.0001
power: 4.0
decay_steps: 800000
warmup:
type: 'linear'
linear:
warmup_steps: 1000 # learning rate rises from 0 to 0.1 over 1000 steps
runtime:
distribution_strategy: 'mirrored'
mixed_precision_dtype: 'float16'
loss_scale: 'dynamic'
num_gpus: 2
task:
model:
num_classes: 1001
input_size: [256, 256, 3]
backbone:
type: 'darknet'
darknet:
model_id: 'darknet53'
norm_activation:
activation: 'mish'
losses:
l2_weight_decay: 0.0005
one_hot: true
train_data:
tfds_name: 'imagenet2012'
tfds_split: 'train'
tfds_data_dir: '~/tensorflow_datasets/'
tfds_download: true
is_training: true
global_batch_size: 16 # default = 128
dtype: 'float16'
shuffle_buffer_size: 100
validation_data:
tfds_name: 'imagenet2012'
tfds_split: 'validation'
tfds_data_dir: '~/tensorflow_datasets/'
tfds_download: true
is_training: true
global_batch_size: 16 # default = 128
dtype: 'float16'
drop_remainder: false
shuffle_buffer_size: 100
trainer:
train_steps: 6400000 # epochs: 80, 800000 * 128/batchsize
validation_steps: 3200 # size of validation data, 400 * 128/batchsize
validation_interval: 10000 # 10000
steps_per_loop: 10000
summary_interval: 10000
checkpoint_interval: 10000
optimizer_config:
optimizer:
type: 'sgd'
sgd:
momentum: 0.9
learning_rate:
type: 'polynomial'
polynomial:
initial_learning_rate: 0.0125 # 0.1 * batchsize/128, default = 0.1
end_learning_rate: 0.0000125 # 0.0001 * batchsize/128, default = 0.0001
power: 4.0
decay_steps: 6392000 # 799000 * 128/batchsize, default = 800000 - 1000 = 799000
warmup:
type: 'linear'
linear:
warmup_steps: 8000 # 1000 * 128/batchsize, default = 1000
# Copyright 2020 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""TFDS Classification decoder."""
import tensorflow as tf
from official.vision.beta.dataloaders import decoder
class Decoder(decoder.Decoder):
"""A tf.Example decoder for classification task."""
def __init__(self):
return
def decode(self, serialized_example):
sample_dict = {
'image/encoded': tf.io.encode_jpeg(
serialized_example['image'], quality=100),
'image/class/label': serialized_example['label'],
}
return sample_dict
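# Usage sketch (illustrative): the sample dict below mimics a decoded TFDS
# imagenet2012 example, with a raw uint8 'image' tensor and an integer 'label'.
def _example_decode():
  """Re-encodes a TFDS-style sample into the tf.Example feature layout."""
  sample = {
      'image': tf.zeros([224, 224, 3], dtype=tf.uint8),
      'label': tf.constant(1, dtype=tf.int64),
  }
  decoded = Decoder().decode(sample)
  # decoded['image/encoded'] holds a JPEG-encoded string tensor and
  # decoded['image/class/label'] holds the original label.
  return decoded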
# Lint as: python3
# Copyright 2020 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Contains definitions of Darknet Backbone Networks.
The models are inspired by ResNet and CSPNet.
Residual networks (ResNets) were proposed in:
[1] Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun
Deep Residual Learning for Image Recognition. arXiv:1512.03385
Cross Stage Partial networks (CSPNets) were proposed in:
[1] Chien-Yao Wang, Hong-Yuan Mark Liao, I-Hau Yeh, Yueh-Hua Wu, Ping-Yang Chen,
Jun-Wei Hsieh
CSPNet: A New Backbone that can Enhance Learning Capability of CNN.
arXiv:1911.11929
Darknets are used mainly for object detection in:
[1] Joseph Redmon, Ali Farhadi
YOLOv3: An Incremental Improvement. arXiv:1804.02767
[2] Alexey Bochkovskiy, Chien-Yao Wang, Hong-Yuan Mark Liao
YOLOv4: Optimal Speed and Accuracy of Object Detection. arXiv:2004.10934
"""
import collections
import tensorflow as tf
from official.vision.beta.modeling.backbones import factory
from official.vision.beta.projects.yolo.modeling.layers import nn_blocks
class BlockConfig(object):
"""Get layer config to make code more readable.
Args:
layer: string layer name
stack: the type of layer ordering to use for this specific level
reps: integer for the number of times to repeat the block
bottleneck: boolean for whether this stack has a bottleneck layer
filters: integer for the output depth of the level
pool_size: integer pool size for max pool layers
kernel_size: optional integer, for convolution kernel size
strides: integer or tuple to indicate convolution strides
padding: the padding to apply to layers in this stack
activation: string for the activation to use for this stack
route: integer for what level to route from to get the next input
output_name: the name to use for this output
is_output: is this layer an output in the default model
"""
def __init__(self, layer, stack, reps, bottleneck, filters, pool_size,
kernel_size, strides, padding, activation, route, output_name,
is_output):
self.layer = layer
self.stack = stack
self.repetitions = reps
self.bottleneck = bottleneck
self.filters = filters
self.kernel_size = kernel_size
self.pool_size = pool_size
self.strides = strides
self.padding = padding
self.activation = activation
self.route = route
self.output_name = output_name
self.is_output = is_output
def build_block_specs(config):
specs = []
for layer in config:
specs.append(BlockConfig(*layer))
return specs
class LayerFactory(object):
"""Class for quick look up of default layers.
Used by darknet to connect, introduce or exit a level. Used in place of an if
condition or switch to make adding new layers easier and to reduce redundant
code.
"""
def __init__(self):
self._layer_dict = {
"ConvBN": (nn_blocks.ConvBN, self.conv_bn_config_todict),
"MaxPool": (tf.keras.layers.MaxPool2D, self.maxpool_config_todict)
}
def conv_bn_config_todict(self, config, kwargs):
dictvals = {
"filters": config.filters,
"kernel_size": config.kernel_size,
"strides": config.strides,
"padding": config.padding
}
dictvals.update(kwargs)
return dictvals
def darktiny_config_todict(self, config, kwargs):
dictvals = {"filters": config.filters, "strides": config.strides}
dictvals.update(kwargs)
return dictvals
def maxpool_config_todict(self, config, kwargs):
return {
"pool_size": config.pool_size,
"strides": config.strides,
"padding": config.padding,
"name": kwargs["name"]
}
def __call__(self, config, kwargs):
layer, get_param_dict = self._layer_dict[config.layer]
param_dict = get_param_dict(config, kwargs)
return layer(**param_dict)
# model configs
LISTNAMES = [
"default_layer_name", "level_type", "number_of_layers_in_level",
"bottleneck", "filters", "kernal_size", "pool_size", "strides", "padding",
"default_activation", "route", "level/name", "is_output"
]
# pylint: disable=line-too-long
CSPDARKNET53 = {
"list_names": LISTNAMES,
"splits": {"backbone_split": 106,
"neck_split": 138},
"backbone": [
["ConvBN", None, 1, False, 32, None, 3, 1, "same", "mish", -1, 0, False],
["DarkRes", "csp", 1, True, 64, None, None, None, None, "mish", -1, 1, False],
["DarkRes", "csp", 2, False, 128, None, None, None, None, "mish", -1, 2, False],
["DarkRes", "csp", 8, False, 256, None, None, None, None, "mish", -1, 3, True],
["DarkRes", "csp", 8, False, 512, None, None, None, None, "mish", -1, 4, True],
["DarkRes", "csp", 4, False, 1024, None, None, None, None, "mish", -1, 5, True],
]
}
DARKNET53 = {
"list_names": LISTNAMES,
"splits": {"backbone_split": 76},
"backbone": [
["ConvBN", None, 1, False, 32, None, 3, 1, "same", "leaky", -1, 0, False],
["DarkRes", "residual", 1, True, 64, None, None, None, None, "leaky", -1, 1, False],
["DarkRes", "residual", 2, False, 128, None, None, None, None, "leaky", -1, 2, False],
["DarkRes", "residual", 8, False, 256, None, None, None, None, "leaky", -1, 3, True],
["DarkRes", "residual", 8, False, 512, None, None, None, None, "leaky", -1, 4, True],
["DarkRes", "residual", 4, False, 1024, None, None, None, None, "leaky", -1, 5, True],
]
}
CSPDARKNETTINY = {
"list_names": LISTNAMES,
"splits": {"backbone_split": 28},
"backbone": [
["ConvBN", None, 1, False, 32, None, 3, 2, "same", "leaky", -1, 0, False],
["ConvBN", None, 1, False, 64, None, 3, 2, "same", "leaky", -1, 1, False],
["CSPTiny", "csp_tiny", 1, False, 64, None, 3, 2, "same", "leaky", -1, 2, False],
["CSPTiny", "csp_tiny", 1, False, 128, None, 3, 2, "same", "leaky", -1, 3, False],
["CSPTiny", "csp_tiny", 1, False, 256, None, 3, 2, "same", "leaky", -1, 4, True],
["ConvBN", None, 1, False, 512, None, 3, 1, "same", "leaky", -1, 5, True],
]
}
DARKNETTINY = {
"list_names": LISTNAMES,
"splits": {"backbone_split": 14},
"backbone": [
["ConvBN", None, 1, False, 16, None, 3, 1, "same", "leaky", -1, 0, False],
["DarkTiny", "tiny", 1, True, 32, None, 3, 2, "same", "leaky", -1, 1, False],
["DarkTiny", "tiny", 1, True, 64, None, 3, 2, "same", "leaky", -1, 2, False],
["DarkTiny", "tiny", 1, False, 128, None, 3, 2, "same", "leaky", -1, 3, False],
["DarkTiny", "tiny", 1, False, 256, None, 3, 2, "same", "leaky", -1, 4, True],
["DarkTiny", "tiny", 1, False, 512, None, 3, 2, "same", "leaky", -1, 5, False],
["DarkTiny", "tiny", 1, False, 1024, None, 3, 1, "same", "leaky", -1, 5, True],
]
}
# pylint: enable=line-too-long
BACKBONES = {
"darknettiny": DARKNETTINY,
"darknet53": DARKNET53,
"cspdarknet53": CSPDARKNET53,
"cspdarknettiny": CSPDARKNETTINY
}
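# Usage sketch (illustrative): the tables above expand into BlockConfig
# objects whose fields follow the names listed in LISTNAMES.
def _example_block_specs():
  """Builds the darknet53 block specs and returns (repetitions, filters)."""
  specs = build_block_specs(DARKNET53["backbone"])
  return [(spec.repetitions, spec.filters) for spec in specs]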
@tf.keras.utils.register_keras_serializable(package="yolo")
class Darknet(tf.keras.Model):
"""Darknet backbone."""
def __init__(
self,
model_id="darknet53",
input_specs=tf.keras.layers.InputSpec(shape=[None, None, None, 3]),
min_level=None,
max_level=5,
activation=None,
use_sync_bn=False,
norm_momentum=0.99,
norm_epsilon=0.001,
kernel_initializer="glorot_uniform",
kernel_regularizer=None,
bias_regularizer=None,
**kwargs):
layer_specs, splits = Darknet.get_model_config(model_id)
self._model_name = model_id
self._splits = splits
self._input_shape = input_specs
self._registry = LayerFactory()
# default layer look up
self._min_size = min_level
self._max_size = max_level
self._output_specs = None
self._kernel_initializer = kernel_initializer
self._bias_regularizer = bias_regularizer
self._norm_momentum = norm_momentum
self._norm_epsilon = norm_epsilon
self._use_sync_bn = use_sync_bn
self._activation = activation
self._kernel_regularizer = kernel_regularizer
self._default_dict = {
"kernel_initializer": self._kernel_initializer,
"kernel_regularizer": self._kernel_regularizer,
"bias_regularizer": self._bias_regularizer,
"norm_momentum": self._norm_momentum,
"norm_epsilon": self._norm_epislon,
"use_sync_bn": self._use_sync_bn,
"activation": self._activation,
"name": None
}
inputs = tf.keras.layers.Input(shape=self._input_shape.shape[1:])
output = self._build_struct(layer_specs, inputs)
super().__init__(inputs=inputs, outputs=output, name=self._model_name)
@property
def input_specs(self):
return self._input_shape
@property
def output_specs(self):
return self._output_specs
@property
def splits(self):
return self._splits
def _build_struct(self, net, inputs):
endpoints = collections.OrderedDict()
stack_outputs = [inputs]
for i, config in enumerate(net):
if config.stack is None:
x = self._build_block(stack_outputs[config.route],
config,
name=f"{config.layer}_{i}")
stack_outputs.append(x)
elif config.stack == "residual":
x = self._residual_stack(stack_outputs[config.route],
config,
name=f"{config.layer}_{i}")
stack_outputs.append(x)
elif config.stack == "csp":
x = self._csp_stack(stack_outputs[config.route],
config,
name=f"{config.layer}_{i}")
stack_outputs.append(x)
elif config.stack == "csp_tiny":
x_pass, x = self._csp_tiny_stack(stack_outputs[config.route],
config, name=f"{config.layer}_{i}")
stack_outputs.append(x_pass)
elif config.stack == "tiny":
x = self._tiny_stack(stack_outputs[config.route],
config,
name=f"{config.layer}_{i}")
stack_outputs.append(x)
if (config.is_output and self._min_size is None):
endpoints[str(config.output_name)] = x
elif self._min_size is not None and config.output_name >= self._min_size and config.output_name <= self._max_size:
endpoints[str(config.output_name)] = x
self._output_specs = {l: endpoints[l].get_shape() for l in endpoints.keys()}
return endpoints
def _get_activation(self, activation):
if self._activation is None:
return activation
else:
return self._activation
def _csp_stack(self, inputs, config, name):
if config.bottleneck:
csp_filter_scale = 1
residual_filter_scale = 2
scale_filters = 1
else:
csp_filter_scale = 2
residual_filter_scale = 1
scale_filters = 2
self._default_dict["activation"] = self._get_activation(config.activation)
self._default_dict["name"] = f"{name}_csp_down"
x, x_route = nn_blocks.CSPRoute(filters=config.filters,
filter_scale=csp_filter_scale,
downsample=True,
**self._default_dict)(inputs)
for i in range(config.repetitions):
self._default_dict["name"] = f"{name}_{i}"
x = nn_blocks.DarkResidual(filters=config.filters // scale_filters,
filter_scale=residual_filter_scale,
**self._default_dict)(x)
self._default_dict["name"] = f"{name}_csp_connect"
output = nn_blocks.CSPConnect(filters=config.filters,
filter_scale=csp_filter_scale,
**self._default_dict)([x, x_route])
self._default_dict["activation"] = self._activation
self._default_dict["name"] = None
return output
def _csp_tiny_stack(self, inputs, config, name):
self._default_dict["activation"] = self._get_activation(config.activation)
self._default_dict["name"] = f"{name}_csp_tiny"
x, x_route = nn_blocks.CSPTiny(filters=config.filters,
**self._default_dict)(inputs)
self._default_dict["activation"] = self._activation
self._default_dict["name"] = None
return x, x_route
def _tiny_stack(self, inputs, config, name):
x = tf.keras.layers.MaxPool2D(pool_size=2,
strides=config.strides,
padding="same",
data_format=None,
name=f"{name}_tiny/pool")(inputs)
self._default_dict["activation"] = self._get_activation(config.activation)
self._default_dict["name"] = f"{name}_tiny/conv"
x = nn_blocks.ConvBN(
filters=config.filters,
kernel_size=(3, 3),
strides=(1, 1),
padding="same",
**self._default_dict)(
x)
self._default_dict["activation"] = self._activation
self._default_dict["name"] = None
return x
def _residual_stack(self, inputs, config, name):
self._default_dict["activation"] = self._get_activation(config.activation)
self._default_dict["name"] = f"{name}_residual_down"
x = nn_blocks.DarkResidual(filters=config.filters,
downsample=True,
**self._default_dict)(inputs)
for i in range(config.repetitions - 1):
self._default_dict["name"] = f"{name}_{i}"
x = nn_blocks.DarkResidual(filters=config.filters,
**self._default_dict)(x)
self._default_dict["activation"] = self._activation
self._default_dict["name"] = None
return x
def _build_block(self, inputs, config, name):
x = inputs
i = 0
self._default_dict["activation"] = self._get_activation(config.activation)
while i < config.repetitions:
self._default_dict["name"] = f"{name}_{i}"
layer = self._registry(config, self._default_dict)
x = layer(x)
i += 1
self._default_dict["activation"] = self._activation
self._default_dict["name"] = None
return x
@staticmethod
def get_model_config(name):
name = name.lower()
backbone = BACKBONES[name]["backbone"]
splits = BACKBONES[name]["splits"]
return build_block_specs(backbone), splits
@property
def model_id(self):
return self._model_name
@classmethod
def from_config(cls, config, custom_objects=None):
return cls(**config)
def get_config(self):
layer_config = {
"model_id": self._model_name,
"min_level": self._min_size,
"max_level": self._max_size,
"kernel_initializer": self._kernel_initializer,
"kernel_regularizer": self._kernel_regularizer,
"bias_regularizer": self._bias_regularizer,
"norm_momentum": self._norm_momentum,
"norm_epsilon": self._norm_epislon,
"use_sync_bn": self._use_sync_bn,
"activation": self._activation
}
return layer_config
@factory.register_backbone_builder("darknet")
def build_darknet(
input_specs: tf.keras.layers.InputSpec,
model_config,
l2_regularizer: tf.keras.regularizers.Regularizer = None) -> tf.keras.Model:
"""Builds darknet backbone."""
backbone_cfg = model_config.backbone.get()
norm_activation_config = model_config.norm_activation
model = Darknet(
model_id=backbone_cfg.model_id,
input_specs=input_specs,
activation=norm_activation_config.activation,
use_sync_bn=norm_activation_config.use_sync_bn,
norm_momentum=norm_activation_config.norm_momentum,
norm_epsilon=norm_activation_config.norm_epsilon,
kernel_regularizer=l2_regularizer)
return model
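# Usage sketch (illustrative): the backbone can also be instantiated directly,
# without going through the config system, to inspect the endpoint shapes it
# reports for levels 3 through 5.
def _example_darknet_backbone():
  """Builds a CSPDarknet53 backbone and returns its output specs."""
  backbone = Darknet(model_id="cspdarknet53", min_level=3, max_level=5)
  return backbone.output_specs  # keyed by level: {"3": ..., "4": ..., "5": ...}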
# Lint as: python3
# Copyright 2020 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Tests for resnet."""
from absl.testing import parameterized
import numpy as np
import tensorflow as tf
from tensorflow.python.distribute import combinations
from tensorflow.python.distribute import strategy_combinations
from official.vision.beta.projects.yolo.modeling.backbones import darknet
class DarkNetTest(parameterized.TestCase, tf.test.TestCase):
@parameterized.parameters(
(224, "darknet53", 2, 1),
(224, "darknettiny", 1, 2),
(224, "cspdarknettiny", 1, 1),
(224, "cspdarknet53", 2, 1),
)
def test_network_creation(self, input_size, model_id,
endpoint_filter_scale, scale_final):
"""Test creation of ResNet family models."""
tf.keras.backend.set_image_data_format("channels_last")
network = darknet.Darknet(model_id=model_id, min_level=3, max_level=5)
self.assertEqual(network.model_id, model_id)
inputs = tf.keras.Input(shape=(input_size, input_size, 3), batch_size=1)
endpoints = network(inputs)
self.assertAllEqual(
[1, input_size / 2**3, input_size / 2**3, 128 * endpoint_filter_scale],
endpoints["3"].shape.as_list())
self.assertAllEqual(
[1, input_size / 2**4, input_size / 2**4, 256 * endpoint_filter_scale],
endpoints["4"].shape.as_list())
self.assertAllEqual([
1, input_size / 2**5, input_size / 2**5,
512 * endpoint_filter_scale * scale_final
], endpoints["5"].shape.as_list())
@combinations.generate(
combinations.combine(
strategy=[
strategy_combinations.cloud_tpu_strategy,
strategy_combinations.one_device_strategy_gpu,
],
use_sync_bn=[False, True],
))
def test_sync_bn_multiple_devices(self, strategy, use_sync_bn):
"""Test for sync bn on TPU and GPU devices."""
inputs = np.random.rand(1, 224, 224, 3)
tf.keras.backend.set_image_data_format("channels_last")
with strategy.scope():
network = darknet.Darknet(model_id="darknet53", min_level=3, max_level=5)
_ = network(inputs)
@parameterized.parameters(1, 3, 4)
def test_input_specs(self, input_dim):
"""Test different input feature dimensions."""
tf.keras.backend.set_image_data_format("channels_last")
input_specs = tf.keras.layers.InputSpec(shape=[None, None, None, input_dim])
network = darknet.Darknet(
model_id="darknet53", min_level=3, max_level=5, input_specs=input_specs)
inputs = tf.keras.Input(shape=(224, 224, input_dim), batch_size=1)
_ = network(inputs)
def test_serialize_deserialize(self):
# Create a network object that sets all of its config options.
kwargs = dict(
model_id="darknet53",
min_level=3,
max_level=5,
use_sync_bn=False,
activation="relu",
norm_momentum=0.99,
norm_epsilon=0.001,
kernel_initializer="VarianceScaling",
kernel_regularizer=None,
bias_regularizer=None,
)
network = darknet.Darknet(**kwargs)
expected_config = dict(kwargs)
self.assertEqual(network.get_config(), expected_config)
# Create another network object from the first object's config.
new_network = darknet.Darknet.from_config(network.get_config())
# Validate that the config can be forced to JSON.
_ = new_network.to_json()
# If the serialization was successful, the new config should match the old.
self.assertAllEqual(network.get_config(), new_network.get_config())
if __name__ == "__main__":
tf.test.main()
# Lint as: python3
# Copyright 2020 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Contains common building blocks for yolo neural networks."""
from typing import Callable, List
import tensorflow as tf
from official.modeling import tf_utils
@tf.keras.utils.register_keras_serializable(package="yolo")
class Identity(tf.keras.layers.Layer):
def call(self, inputs):
return inputs
@tf.keras.utils.register_keras_serializable(package="yolo")
class ConvBN(tf.keras.layers.Layer):
"""Modified Convolution layer to match that of the DarkNet Library.
The layer is a standard combination of Conv, BatchNorm, and Activation;
however, the use of bias in the conv is determined by the use of batch norm.
Cross Stage Partial networks (CSPNets) were proposed in:
[1] Chien-Yao Wang, Hong-Yuan Mark Liao, I-Hau Yeh, Yueh-Hua Wu, Ping-Yang
Chen, Jun-Wei Hsieh.
CSPNet: A New Backbone that can Enhance Learning Capability of CNN.
arXiv:1911.11929
"""
def __init__(self,
filters=1,
kernel_size=(1, 1),
strides=(1, 1),
padding="same",
dilation_rate=(1, 1),
kernel_initializer="glorot_uniform",
bias_initializer="zeros",
kernel_regularizer=None,
bias_regularizer=None,
use_bn=True,
use_sync_bn=False,
norm_momentum=0.99,
norm_epsilon=0.001,
activation="leaky",
leaky_alpha=0.1,
**kwargs):
"""Initializes ConvBN layer.
Args:
filters: integer for output depth, or the number of features to learn
kernel_size: integer or tuple for the shape of the weight matrix or kernel
to learn.
strides: integer or tuple for how much to move the kernel after each kernel
  use.
padding: string 'valid' or 'same'; if 'same', pad the image, else do not.
dilation_rate: tuple to indicate how much to modulate kernel weights and
how many pixels in a feature map to skip.
kernel_initializer: string to indicate which function to use to initialize
weights.
bias_initializer: string to indicate which function to use to initialize
bias.
kernel_regularizer: string to indicate which function to use to regularize
  weights.
bias_regularizer: string to indicate which function to use to regularize
  bias.
use_bn: boolean for whether to use batch normalization.
use_sync_bn: boolean for whether to use synchronized batch normalization.
norm_momentum: float for momentum to use for batch normalization.
norm_epsilon: float for batch normalization epsilon.
activation: string or None for activation function to use in layer,
if None activation is replaced by linear.
leaky_alpha: float to use as alpha if activation function is leaky.
**kwargs: Keyword Arguments
"""
# convolution params
self._filters = filters
self._kernel_size = kernel_size
self._strides = strides
self._padding = padding
self._dilation_rate = dilation_rate
self._kernel_initializer = kernel_initializer
self._bias_initializer = bias_initializer
self._kernel_regularizer = kernel_regularizer
self._bias_regularizer = bias_regularizer
# batch normalization params
self._use_bn = use_bn
self._use_sync_bn = use_sync_bn
self._norm_moment = norm_momentum
self._norm_epsilon = norm_epsilon
if tf.keras.backend.image_data_format() == "channels_last":
# format: (batch_size, height, width, channels)
self._bn_axis = -1
else:
# format: (batch_size, channels, width, height)
self._bn_axis = 1
# activation params
self._activation = activation
self._leaky_alpha = leaky_alpha
super(ConvBN, self).__init__(**kwargs)
def build(self, input_shape):
use_bias = not self._use_bn
self.conv = tf.keras.layers.Conv2D(
filters=self._filters,
kernel_size=self._kernel_size,
strides=self._strides,
padding=self._padding,
dilation_rate=self._dilation_rate,
use_bias=use_bias,
kernel_initializer=self._kernel_initializer,
bias_initializer=self._bias_initializer,
kernel_regularizer=self._kernel_regularizer,
bias_regularizer=self._bias_regularizer)
if self._use_bn:
if self._use_sync_bn:
self.bn = tf.keras.layers.experimental.SyncBatchNormalization(
momentum=self._norm_moment,
epsilon=self._norm_epsilon,
axis=self._bn_axis)
else:
self.bn = tf.keras.layers.BatchNormalization(
momentum=self._norm_moment,
epsilon=self._norm_epsilon,
axis=self._bn_axis)
else:
self.bn = Identity()
if self._activation == "leaky":
self._activation_fn = tf.keras.layers.LeakyReLU(alpha=self._leaky_alpha)
elif self._activation == "mish":
self._activation_fn = lambda x: x * tf.math.tanh(tf.math.softplus(x))
else:
self._activation_fn = tf_utils.get_activation(self._activation)
def call(self, x):
x = self.conv(x)
x = self.bn(x)
x = self._activation_fn(x)
return x
def get_config(self):
# used to store/share parameters to reconstruct the model
layer_config = {
"filters": self._filters,
"kernel_size": self._kernel_size,
"strides": self._strides,
"padding": self._padding,
"dilation_rate": self._dilation_rate,
"kernel_initializer": self._kernel_initializer,
"bias_initializer": self._bias_initializer,
"bias_regularizer": self._bias_regularizer,
"kernel_regularizer": self._kernel_regularizer,
"use_bn": self._use_bn,
"use_sync_bn": self._use_sync_bn,
"norm_moment": self._norm_moment,
"norm_epsilon": self._norm_epsilon,
"activation": self._activation,
"leaky_alpha": self._leaky_alpha
}
layer_config.update(super(ConvBN, self).get_config())
return layer_config
def __repr__(self):
return repr(self.get_config())
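# Usage sketch (illustrative): a ConvBN block chains Conv2D, optional
# BatchNorm, and the requested activation; bias is dropped when batch norm is
# used. The shapes below are arbitrary.
def _example_conv_bn():
  """Applies a strided ConvBN block with the mish activation."""
  x = tf.random.uniform([1, 64, 64, 3])
  layer = ConvBN(filters=32, kernel_size=(3, 3), strides=(2, 2),
                 padding="same", activation="mish")
  return layer(x)  # shape: (1, 32, 32, 32)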
@tf.keras.utils.register_keras_serializable(package="yolo")
class DarkResidual(tf.keras.layers.Layer):
"""DarkNet block with Residual connection for Yolo v3 Backbone.
"""
def __init__(self,
filters=1,
filter_scale=2,
kernel_initializer="glorot_uniform",
bias_initializer="zeros",
kernel_regularizer=None,
bias_regularizer=None,
use_bn=True,
use_sync_bn=False,
norm_momentum=0.99,
norm_epsilon=0.001,
activation="leaky",
leaky_alpha=0.1,
sc_activation="linear",
downsample=False,
**kwargs):
"""Initializes DarkResidual.
Args:
filters: integer for output depth, or the number of features to learn.
filter_scale: `int`, scale factor for number of filters.
kernel_initializer: string to indicate which function to use to initialize
weights
bias_initializer: string to indicate which function to use to initialize
bias
kernel_regularizer: string to indicate which function to use to regularize
  weights.
bias_regularizer: string to indicate which function to use to regularize
  bias.
use_bn: boolean for whether to use batch normalization.
use_sync_bn: boolean for whether to use synchronized batch normalization.
norm_momentum: float for momentum to use for batch normalization.
norm_epsilon: float for batch normalization epsilon.
activation: string for activation function to use in conv layers.
leaky_alpha: float to use as alpha if activation function is leaky
sc_activation: string for activation function to use in layer
downsample: boolean for if image input is larger than layer output, set
downsample to True so the dimensions are forced to match
**kwargs: Keyword Arguments
"""
# downsample
self._downsample = downsample
# ConvBN params
self._filters = filters
self._filter_scale = filter_scale
self._kernel_initializer = kernel_initializer
self._bias_initializer = bias_initializer
self._bias_regularizer = bias_regularizer
self._use_bn = use_bn
self._use_sync_bn = use_sync_bn
self._kernel_regularizer = kernel_regularizer
# normal params
self._norm_moment = norm_momentum
self._norm_epsilon = norm_epsilon
# activation params
self._conv_activation = activation
self._leaky_alpha = leaky_alpha
self._sc_activation = sc_activation
super().__init__(**kwargs)
def build(self, input_shape):
self._dark_conv_args = {
"kernel_initializer": self._kernel_initializer,
"bias_initializer": self._bias_initializer,
"bias_regularizer": self._bias_regularizer,
"use_bn": self._use_bn,
"use_sync_bn": self._use_sync_bn,
"norm_momentum": self._norm_moment,
"norm_epsilon": self._norm_epsilon,
"activation": self._conv_activation,
"kernel_regularizer": self._kernel_regularizer,
"leaky_alpha": self._leaky_alpha
}
if self._downsample:
self._dconv = ConvBN(
filters=self._filters,
kernel_size=(3, 3),
strides=(2, 2),
padding="same",
**self._dark_conv_args)
else:
self._dconv = Identity()
self._conv1 = ConvBN(
filters=self._filters // self._filter_scale,
kernel_size=(1, 1),
strides=(1, 1),
padding="same",
**self._dark_conv_args)
self._conv2 = ConvBN(
filters=self._filters,
kernel_size=(3, 3),
strides=(1, 1),
padding="same",
**self._dark_conv_args)
self._shortcut = tf.keras.layers.Add()
if self._sc_activation == "leaky":
self._activation_fn = tf.keras.layers.LeakyReLU(
alpha=self._leaky_alpha)
elif self._sc_activation == "mish":
self._activation_fn = lambda x: x * tf.math.tanh(tf.math.softplus(x))
else:
self._activation_fn = tf_utils.get_activation(self._sc_activation)
super().build(input_shape)
def call(self, inputs):
shortcut = self._dconv(inputs)
x = self._conv1(shortcut)
x = self._conv2(x)
x = self._shortcut([x, shortcut])
return self._activation_fn(x)
def get_config(self):
# used to store/share parameters to reconstruct the model
layer_config = {
"filters": self._filters,
"kernel_initializer": self._kernel_initializer,
"bias_initializer": self._bias_initializer,
"kernel_regularizer": self._kernel_regularizer,
"use_bn": self._use_bn,
"use_sync_bn": self._use_sync_bn,
"norm_moment": self._norm_moment,
"norm_epsilon": self._norm_epsilon,
"activation": self._conv_activation,
"leaky_alpha": self._leaky_alpha,
"sc_activation": self._sc_activation,
"downsample": self._downsample
}
layer_config.update(super().get_config())
return layer_config
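# Usage sketch (illustrative): with downsample=True the residual block halves
# the spatial dimensions before applying the 1x1 / 3x3 convolution pair.
def _example_dark_residual():
  """Applies a downsampling DarkResidual block to a random feature map."""
  x = tf.random.uniform([1, 56, 56, 64])
  block = DarkResidual(filters=128, downsample=True)
  return block(x)  # shape: (1, 28, 28, 128)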
@tf.keras.utils.register_keras_serializable(package="yolo")
class CSPTiny(tf.keras.layers.Layer):
"""A Small size convolution block proposed in the CSPNet.
The layer uses shortcuts, routing (concatenation), and feature grouping
in order to improve gradient variability and allow for high-efficiency,
low-power residual learning for small networks.
Cross Stage Partial networks (CSPNets) were proposed in:
[1] Chien-Yao Wang, Hong-Yuan Mark Liao, I-Hau Yeh, Yueh-Hua Wu, Ping-Yang
Chen, Jun-Wei Hsieh
CSPNet: A New Backbone that can Enhance Learning Capability of CNN.
arXiv:1911.11929
"""
def __init__(self,
filters=1,
kernel_initializer="glorot_uniform",
bias_initializer="zeros",
kernel_regularizer=None,
bias_regularizer=None,
use_bn=True,
use_sync_bn=False,
group_id=1,
groups=2,
norm_momentum=0.99,
norm_epsilon=0.001,
activation="leaky",
downsample=True,
leaky_alpha=0.1,
**kwargs):
"""Initializes CSPTiny.
Args:
filters: integer for output depth, or the number of features to learn
kernel_initializer: string to indicate which function to use to initialize
weights
bias_initializer: string to indicate which function to use to initialize
bias
kernel_regularizer: string to indicate which function to use to regularize
  weights.
bias_regularizer: string to indicate which function to use to regularize
  bias.
use_bn: boolean for whether to use batch normalization.
use_sync_bn: boolean for whether to synchronize the batch normalization
  statistics of all batch norm layers to the model's global statistics
  (across all input batches).
group_id: integer for which group of features to pass through the csp tiny
  stack.
groups: integer for how many splits there should be in the convolution
  feature stack output.
norm_momentum: float for momentum to use for batch normalization.
norm_epsilon: float for batch normalization epsilon.
activation: string or None for activation function to use in layer,
if None activation is replaced by linear
downsample: boolean for if image input is larger than layer output, set
downsample to True so the dimensions are forced to match
leaky_alpha: float to use as alpha if activation function is leaky
**kwargs: Keyword Arguments
"""
# ConvBN params
self._filters = filters
self._kernel_initializer = kernel_initializer
self._bias_initializer = bias_initializer
self._bias_regularizer = bias_regularizer
self._use_bn = use_bn
self._use_sync_bn = use_sync_bn
self._kernel_regularizer = kernel_regularizer
self._groups = groups
self._group_id = group_id
self._downsample = downsample
# normal params
self._norm_moment = norm_momentum
self._norm_epsilon = norm_epsilon
# activation params
self._conv_activation = activation
self._leaky_alpha = leaky_alpha
super().__init__(**kwargs)
def build(self, input_shape):
self._dark_conv_args = {
"kernel_initializer": self._kernel_initializer,
"bias_initializer": self._bias_initializer,
"bias_regularizer": self._bias_regularizer,
"use_bn": self._use_bn,
"use_sync_bn": self._use_sync_bn,
"norm_momentum": self._norm_moment,
"norm_epsilon": self._norm_epsilon,
"activation": self._conv_activation,
"kernel_regularizer": self._kernel_regularizer,
"leaky_alpha": self._leaky_alpha
}
self._convlayer1 = ConvBN(
filters=self._filters,
kernel_size=(3, 3),
strides=(1, 1),
padding="same",
**self._dark_conv_args)
self._convlayer2 = ConvBN(
filters=self._filters // 2,
kernel_size=(3, 3),
strides=(1, 1),
padding="same",
kernel_initializer=self._kernel_initializer,
bias_initializer=self._bias_initializer,
bias_regularizer=self._bias_regularizer,
kernel_regularizer=self._kernel_regularizer,
use_bn=self._use_bn,
use_sync_bn=self._use_sync_bn,
norm_momentum=self._norm_moment,
norm_epsilon=self._norm_epsilon,
activation=self._conv_activation,
leaky_alpha=self._leaky_alpha)
self._convlayer3 = ConvBN(
filters=self._filters // 2,
kernel_size=(3, 3),
strides=(1, 1),
padding="same",
**self._dark_conv_args)
self._convlayer4 = ConvBN(
filters=self._filters,
kernel_size=(1, 1),
strides=(1, 1),
padding="same",
**self._dark_conv_args)
self._maxpool = tf.keras.layers.MaxPool2D(
pool_size=2, strides=2, padding="same", data_format=None)
super().build(input_shape)
def call(self, inputs):
x1 = self._convlayer1(inputs)
x1_group = tf.split(x1, self._groups, axis=-1)[self._group_id]
x2 = self._convlayer2(x1_group) # grouping
x3 = self._convlayer3(x2)
x4 = tf.concat([x3, x2], axis=-1) # csp partial using grouping
x5 = self._convlayer4(x4)
x = tf.concat([x1, x5], axis=-1) # csp connect
if self._downsample:
x = self._maxpool(x)
return x, x5
def get_config(self):
# used to store/share parameters to reconstruct the model
layer_config = {
"filters": self._filters,
"strides": self._strides,
"kernel_initializer": self._kernel_initializer,
"bias_initializer": self._bias_initializer,
"kernel_regularizer": self._kernel_regularizer,
"use_bn": self._use_bn,
"use_sync_bn": self._use_sync_bn,
"norm_moment": self._norm_moment,
"norm_epsilon": self._norm_epsilon,
"activation": self._conv_activation,
"leaky_alpha": self._leaky_alpha,
"sc_activation": self._sc_activation,
}
layer_config.update(super().get_config())
return layer_config
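# Usage sketch (illustrative): CSPTiny returns both the (optionally
# downsampled) concatenated output and the csp partial used for routing.
def _example_csp_tiny():
  """Runs a CSPTiny block on a random feature map."""
  x = tf.random.uniform([1, 64, 64, 64])
  block = CSPTiny(filters=64)
  y, y_partial = block(x)  # y: (1, 32, 32, 128), y_partial: (1, 64, 64, 64)
  return y, y_partial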
@tf.keras.utils.register_keras_serializable(package="yolo")
class CSPRoute(tf.keras.layers.Layer):
"""Down sampling layer to take the place of down sampleing.
It is applied in Residual networks. This is the first of 2 layers needed to
convert any Residual Network model to a CSPNet. At the start of a new level
change, this CSPRoute layer creates a learned identity that will act as a
cross stage connection, that is used to inform the inputs to the next stage.
It is called cross stage partial because the number of filters required in
every intermitent Residual layer is reduced by half. The sister layer will
take the partial generated by this layer and concatnate it with the output of
the final residual layer in the stack to create a fully feature level output.
This concatnation merges the partial blocks of 2 levels as input to the next
allowing the gradients of each level to be more unique, and reducing the
number of parameters required by each level by 50% while keeping accuracy
consistent.
Cross Stage Partial networks (CSPNets) were proposed in:
[1] Chien-Yao Wang, Hong-Yuan Mark Liao, I-Hau Yeh, Yueh-Hua Wu, Ping-Yang
Chen, Jun-Wei Hsieh.
CSPNet: A New Backbone that can Enhance Learning Capability of CNN.
arXiv:1911.11929
"""
def __init__(self,
filters,
filter_scale=2,
activation="mish",
downsample=True,
kernel_initializer="glorot_uniform",
bias_initializer="zeros",
kernel_regularizer=None,
bias_regularizer=None,
use_bn=True,
use_sync_bn=False,
norm_momentum=0.99,
norm_epsilon=0.001,
**kwargs):
"""Initializes CSPRoute.
Args:
filters: integer for output depth, or the number of features to learn
filter_scale: integer divisor for the number of filters in the partial
  feature stack (filters // filter_scale).
activation: string for activation function to use in layer.
downsample: boolean for whether to downsample the input.
kernel_initializer: string to indicate which function to use to initialize
weights.
bias_initializer: string to indicate which function to use to initialize
bias.
kernel_regularizer: string to indicate which function to use to regularize
  weights.
bias_regularizer: string to indicate which function to use to regularize
  bias.
use_bn: boolean for whether to use batch normalization.
use_sync_bn: boolean for whether to use synchronized batch normalization.
norm_momentum: float for momentum to use for batch normalization.
norm_epsilon: float for batch normalization epsilon.
**kwargs: Keyword Arguments
"""
super().__init__(**kwargs)
# Layer params.
self._filters = filters
self._filter_scale = filter_scale
self._activation = activation
# Convoultion params.
self._kernel_initializer = kernel_initializer
self._bias_initializer = bias_initializer
self._kernel_regularizer = kernel_regularizer
self._bias_regularizer = bias_regularizer
self._use_bn = use_bn
self._use_sync_bn = use_sync_bn
self._norm_moment = norm_momentum
self._norm_epsilon = norm_epsilon
self._downsample = downsample
def build(self, input_shape):
self._dark_conv_args = {
"kernel_initializer": self._kernel_initializer,
"bias_initializer": self._bias_initializer,
"bias_regularizer": self._bias_regularizer,
"use_bn": self._use_bn,
"use_sync_bn": self._use_sync_bn,
"norm_momentum": self._norm_moment,
"norm_epsilon": self._norm_epsilon,
"activation": self._activation,
"kernel_regularizer": self._kernel_regularizer,
}
if self._downsample:
self._conv1 = ConvBN(filters=self._filters,
kernel_size=(3, 3),
strides=(2, 2),
**self._dark_conv_args)
else:
self._conv1 = ConvBN(filters=self._filters,
kernel_size=(3, 3),
strides=(1, 1),
**self._dark_conv_args)
self._conv2 = ConvBN(filters=self._filters // self._filter_scale,
kernel_size=(1, 1),
strides=(1, 1),
**self._dark_conv_args)
self._conv3 = ConvBN(filters=self._filters // self._filter_scale,
kernel_size=(1, 1),
strides=(1, 1),
**self._dark_conv_args)
def call(self, inputs):
x = self._conv1(inputs)
y = self._conv2(x)
x = self._conv3(x)
return (x, y)
@tf.keras.utils.register_keras_serializable(package="yolo")
class CSPConnect(tf.keras.layers.Layer):
"""Sister Layer to the CSPRoute layer.
Merges the partial feature stacks generated by the CSPDownsampling layer,
and the finaly output of the residual stack. Suggested in the CSPNet paper.
Cross Stage Partial networks (CSPNets) were proposed in:
[1] Chien-Yao Wang, Hong-Yuan Mark Liao, I-Hau Yeh, Yueh-Hua Wu, Ping-Yang
Chen, Jun-Wei Hsieh.
CSPNet: A New Backbone that can Enhance Learning Capability of CNN.
arXiv:1911.11929
"""
def __init__(self,
filters,
filter_scale=2,
activation="mish",
kernel_initializer="glorot_uniform",
bias_initializer="zeros",
kernel_regularizer=None,
bias_regularizer=None,
use_bn=True,
use_sync_bn=False,
norm_momentum=0.99,
norm_epsilon=0.001,
**kwargs):
"""Initializes CSPConnect.
Args:
filters: integer for output depth, or the number of features to learn.
filter_scale: integer divisor for the number of filters in the partial
  feature stack (filters // filter_scale).
activation: string for activation function to use in layer.
kernel_initializer: string to indicate which function to use to initialize
weights.
bias_initializer: string to indicate which function to use to initialize
bias.
kernel_regularizer: string to indicate which function to use to regularize
  weights.
bias_regularizer: string to indicate which function to use to regularize
  bias.
use_bn: boolean for whether to use batch normalization.
use_sync_bn: boolean for whether to use synchronized batch normalization.
norm_momentum: float for momentum to use for batch normalization.
norm_epsilon: float for batch normalization epsilon.
**kwargs: Keyword Arguments
"""
super().__init__(**kwargs)
# layer params.
self._filters = filters
self._filter_scale = filter_scale
self._activation = activation
# Convoultion params.
self._kernel_initializer = kernel_initializer
self._bias_initializer = bias_initializer
self._kernel_regularizer = kernel_regularizer
self._bias_regularizer = bias_regularizer
self._use_bn = use_bn
self._use_sync_bn = use_sync_bn
self._norm_moment = norm_momentum
self._norm_epsilon = norm_epsilon
def build(self, input_shape):
self._dark_conv_args = {
"kernel_initializer": self._kernel_initializer,
"bias_initializer": self._bias_initializer,
"bias_regularizer": self._bias_regularizer,
"use_bn": self._use_bn,
"use_sync_bn": self._use_sync_bn,
"norm_momentum": self._norm_moment,
"norm_epsilon": self._norm_epsilon,
"activation": self._activation,
"kernel_regularizer": self._kernel_regularizer,
}
self._conv1 = ConvBN(filters=self._filters // self._filter_scale,
kernel_size=(1, 1),
strides=(1, 1),
**self._dark_conv_args)
self._concat = tf.keras.layers.Concatenate(axis=-1)
self._conv2 = ConvBN(filters=self._filters,
kernel_size=(1, 1),
strides=(1, 1),
**self._dark_conv_args)
def call(self, inputs):
x_prev, x_csp = inputs
x = self._conv1(x_prev)
x = self._concat([x, x_csp])
x = self._conv2(x)
return x
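# Usage sketch (illustrative): CSPRoute and CSPConnect are used as a pair,
# with the wrapped residual blocks in between, mirroring Darknet._csp_stack
# for the non-bottleneck case. The shapes below are arbitrary.
def _example_csp_route_connect():
  """Chains CSPRoute -> DarkResidual -> CSPConnect on a random feature map."""
  x = tf.random.uniform([1, 64, 64, 64])
  route = CSPRoute(filters=128, filter_scale=2)
  residual = DarkResidual(filters=128 // 2, filter_scale=1)
  connect = CSPConnect(filters=128, filter_scale=2)
  y, y_route = route(x)         # downsampled partial and cross-stage route
  y = residual(y)
  return connect([y, y_route])  # shape: (1, 32, 32, 128)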
class CSPStack(tf.keras.layers.Layer):
"""CSP full stack.
Combines the route and the connect in case you want to just quickly wrap an
existing callable or list of layers to make it cross stage partial. Added for
ease of use. You should be able to wrap any layer stack with a CSP independent
of whether it belongs to the Darknet family. If filter_scale = 2, then the
blocks in the stack passed into the CSP stack should also have
filters = filters / filter_scale.
Cross Stage Partial networks (CSPNets) were proposed in:
[1] Chien-Yao Wang, Hong-Yuan Mark Liao, I-Hau Yeh, Yueh-Hua Wu, Ping-Yang
Chen, Jun-Wei Hsieh
CSPNet: A New Backbone that can Enhance Learning Capability of CNN.
arXiv:1911.11929
"""
def __init__(self,
filters,
model_to_wrap=None,
filter_scale=2,
activation="mish",
kernel_initializer="glorot_uniform",
bias_initializer="zeros",
kernel_regularizer=None,
bias_regularizer=None,
downsample=True,
use_bn=True,
use_sync_bn=False,
norm_momentum=0.99,
norm_epsilon=0.001,
**kwargs):
"""Initializes CSPStack.
Args:
filters: integer for output depth, or the number of features to learn.
model_to_wrap: callable Model or a list of callable objects that will
process the output of CSPRoute, and be input into CSPConnect. List will
be called sequentially.
filter_scale: integer divisor for the number of filters in the partial
  feature stack (filters // filter_scale).
activation: string for activation function to use in layer.
kernel_initializer: string to indicate which function to use to initialize
weights.
bias_initializer: string to indicate which function to use to initialize
bias.
kernel_regularizer: string to indicate which function to use to regularize
  weights.
bias_regularizer: string to indicate which function to use to regularize
  bias.
downsample: boolean for whether to downsample the input.
use_bn: boolean for whether to use batch normalization.
use_sync_bn: boolean for whether to use synchronized batch normalization.
norm_momentum: float for momentum to use for batch normalization.
norm_epsilon: float for batch normalization epsilon.
**kwargs: Keyword Arguments
"""
super().__init__(**kwargs)
# Layer params.
self._filters = filters
self._filter_scale = filter_scale
self._activation = activation
self._downsample = downsample
# Convoultion params.
self._kernel_initializer = kernel_initializer
self._bias_initializer = bias_initializer
self._kernel_regularizer = kernel_regularizer
self._bias_regularizer = bias_regularizer
self._use_bn = use_bn
self._use_sync_bn = use_sync_bn
self._norm_moment = norm_momentum
self._norm_epsilon = norm_epsilon
if model_to_wrap is not None:
if isinstance(model_to_wrap, Callable):
self._model_to_wrap = [model_to_wrap]
elif isinstance(model_to_wrap, List):
self._model_to_wrap = model_to_wrap
else:
raise ValueError("The input to the CSPStack must be a list of layers"
"that we can iterate through, or \n a callable")
else:
self._model_to_wrap = []
def build(self, input_shape):
self._dark_conv_args = {
"filters": self._filters,
"filter_scale": self._filter_scale,
"activation": self._activation,
"kernel_initializer": self._kernel_initializer,
"bias_initializer": self._bias_initializer,
"bias_regularizer": self._bias_regularizer,
"use_bn": self._use_bn,
"use_sync_bn": self._use_sync_bn,
"norm_momentum": self._norm_moment,
"norm_epsilon": self._norm_epsilon,
"kernel_regularizer": self._kernel_regularizer,
}
self._route = CSPRoute(downsample=self._downsample, **self._dark_conv_args)
self._connect = CSPConnect(**self._dark_conv_args)
return
def call(self, inputs):
x, x_route = self._route(inputs)
for layer in self._model_to_wrap:
x = layer(x)
x = self._connect([x, x_route])
return x
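# A minimal usage sketch of CSPStack (illustration only, not used elsewhere in
# this module). It mirrors the "residual_stack" case in the unit tests: the
# wrapped DarkResidual blocks use filters // filter_scale so their width
# matches the partial branch produced by CSPRoute. The helper name and the
# input shape below are arbitrary choices made for the example.
def _csp_stack_example():
  blocks = [
      DarkResidual(filters=32, filter_scale=2),
      DarkResidual(filters=32, filter_scale=2),
  ]
  stack = CSPStack(
      filters=64, filter_scale=2, downsample=True, model_to_wrap=blocks)
  x = tf.ones([1, 224, 224, 64])
  # With downsample=True the spatial dims are halved: (1, 112, 112, 64).
  return stack(x)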
# Lint as: python3
# Copyright 2020 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
from absl.testing import parameterized
import numpy as np
import tensorflow as tf
from official.vision.beta.projects.yolo.modeling.layers import nn_blocks
class CSPConnectTest(tf.test.TestCase, parameterized.TestCase):
@parameterized.named_parameters(("same", 224, 224, 64, 1),
("downsample", 224, 224, 64, 2))
def test_pass_through(self, width, height, filters, mod):
x = tf.keras.Input(shape=(width, height, filters))
test_layer = nn_blocks.CSPRoute(filters=filters, filter_scale=mod)
test_layer2 = nn_blocks.CSPConnect(filters=filters, filter_scale=mod)
outx, px = test_layer(x)
outx = test_layer2([outx, px])
print(outx)
print(outx.shape.as_list())
self.assertAllEqual(
outx.shape.as_list(),
[None, np.ceil(width // 2),
np.ceil(height // 2), (filters)])
@parameterized.named_parameters(("same", 224, 224, 64, 1),
("downsample", 224, 224, 128, 2))
def test_gradient_pass_though(self, filters, width, height, mod):
loss = tf.keras.losses.MeanSquaredError()
optimizer = tf.keras.optimizers.SGD()
test_layer = nn_blocks.CSPRoute(filters, filter_scale=mod)
path_layer = nn_blocks.CSPConnect(filters, filter_scale=mod)
init = tf.random_normal_initializer()
x = tf.Variable(
initial_value=init(shape=(1, width, height, filters), dtype=tf.float32))
y = tf.Variable(initial_value=init(shape=(1, int(np.ceil(width // 2)),
int(np.ceil(height // 2)),
filters),
dtype=tf.float32))
with tf.GradientTape() as tape:
x_hat, x_prev = test_layer(x)
x_hat = path_layer([x_hat, x_prev])
grad_loss = loss(x_hat, y)
grad = tape.gradient(grad_loss, test_layer.trainable_variables)
optimizer.apply_gradients(zip(grad, test_layer.trainable_variables))
self.assertNotIn(None, grad)
class CSPRouteTest(tf.test.TestCase, parameterized.TestCase):
@parameterized.named_parameters(("same", 224, 224, 64, 1),
("downsample", 224, 224, 64, 2))
def test_pass_through(self, width, height, filters, mod):
x = tf.keras.Input(shape=(width, height, filters))
test_layer = nn_blocks.CSPRoute(filters=filters, filter_scale=mod)
outx, _ = test_layer(x)
print(outx)
print(outx.shape.as_list())
self.assertAllEqual(
outx.shape.as_list(),
[None, np.ceil(width // 2),
np.ceil(height // 2), (filters / mod)])
@parameterized.named_parameters(("same", 224, 224, 64, 1),
("downsample", 224, 224, 128, 2))
def test_gradient_pass_though(self, filters, width, height, mod):
loss = tf.keras.losses.MeanSquaredError()
optimizer = tf.keras.optimizers.SGD()
test_layer = nn_blocks.CSPRoute(filters, filter_scale=mod)
path_layer = nn_blocks.CSPConnect(filters, filter_scale=mod)
init = tf.random_normal_initializer()
x = tf.Variable(
initial_value=init(shape=(1, width, height, filters), dtype=tf.float32))
y = tf.Variable(initial_value=init(shape=(1, int(np.ceil(width // 2)),
int(np.ceil(height // 2)),
filters),
dtype=tf.float32))
with tf.GradientTape() as tape:
x_hat, x_prev = test_layer(x)
x_hat = path_layer([x_hat, x_prev])
grad_loss = loss(x_hat, y)
grad = tape.gradient(grad_loss, test_layer.trainable_variables)
optimizer.apply_gradients(zip(grad, test_layer.trainable_variables))
self.assertNotIn(None, grad)
class CSPStackTest(tf.test.TestCase, parameterized.TestCase):
def build_layer(
self, layer_type, filters, filter_scale, count, stack_type, downsample):
if stack_type is not None:
layers = []
if layer_type == "residual":
for _ in range(count):
layers.append(
nn_blocks.DarkResidual(
filters=filters // filter_scale, filter_scale=filter_scale))
else:
for _ in range(count):
layers.append(nn_blocks.ConvBN(filters=filters))
if stack_type == "model":
layers = tf.keras.Sequential(layers=layers)
else:
layers = None
stack = nn_blocks.CSPStack(
filters=filters,
filter_scale=filter_scale,
downsample=downsample,
model_to_wrap=layers)
return stack
@parameterized.named_parameters(
("no_stack", 224, 224, 64, 2, "residual", None, 0, True),
("residual_stack", 224, 224, 64, 2, "residual", "list", 2, True),
("conv_stack", 224, 224, 64, 2, "conv", "list", 3, False),
("callable_no_scale", 224, 224, 64, 1, "residual", "model", 5, False))
def test_pass_through(self, width, height, filters, mod, layer_type,
stack_type, count, downsample):
x = tf.keras.Input(shape=(width, height, filters))
test_layer = self.build_layer(layer_type, filters, mod, count, stack_type,
downsample)
outx = test_layer(x)
print(outx)
print(outx.shape.as_list())
if downsample:
self.assertAllEqual(outx.shape.as_list(),
[None, width // 2, height // 2, filters])
else:
self.assertAllEqual(outx.shape.as_list(), [None, width, height, filters])
@parameterized.named_parameters(
("no_stack", 224, 224, 64, 2, "residual", None, 0, True),
("residual_stack", 224, 224, 64, 2, "residual", "list", 2, True),
("conv_stack", 224, 224, 64, 2, "conv", "list", 3, False),
("callable_no_scale", 224, 224, 64, 1, "residual", "model", 5, False))
def test_gradient_pass_though(self, width, height, filters, mod, layer_type,
stack_type, count, downsample):
loss = tf.keras.losses.MeanSquaredError()
optimizer = tf.keras.optimizers.SGD()
init = tf.random_normal_initializer()
x = tf.Variable(
initial_value=init(shape=(1, width, height, filters), dtype=tf.float32))
if not downsample:
y = tf.Variable(
initial_value=init(
shape=(1, width, height, filters), dtype=tf.float32))
else:
y = tf.Variable(
initial_value=init(
shape=(1, width // 2, height // 2, filters), dtype=tf.float32))
test_layer = self.build_layer(layer_type, filters, mod, count, stack_type,
downsample)
with tf.GradientTape() as tape:
x_hat = test_layer(x)
grad_loss = loss(x_hat, y)
grad = tape.gradient(grad_loss, test_layer.trainable_variables)
optimizer.apply_gradients(zip(grad, test_layer.trainable_variables))
self.assertNotIn(None, grad)
class ConvBNTest(tf.test.TestCase, parameterized.TestCase):
@parameterized.named_parameters(
("valid", (3, 3), "valid", (1, 1)), ("same", (3, 3), "same", (1, 1)),
("downsample", (3, 3), "same", (2, 2)), ("test", (1, 1), "valid", (1, 1)))
def test_pass_through(self, kernel_size, padding, strides):
if padding == "same":
pad_const = 1
else:
pad_const = 0
x = tf.keras.Input(shape=(224, 224, 3))
test_layer = nn_blocks.ConvBN(
filters=64,
kernel_size=kernel_size,
padding=padding,
strides=strides,
trainable=False)
outx = test_layer(x)
print(outx.shape.as_list())
test = [
None,
int((224 - kernel_size[0] + (2 * pad_const)) / strides[0] + 1),
int((224 - kernel_size[1] + (2 * pad_const)) / strides[1] + 1), 64
]
print(test)
self.assertAllEqual(outx.shape.as_list(), test)
@parameterized.named_parameters(("filters", 3))
def test_gradient_pass_though(self, filters):
loss = tf.keras.losses.MeanSquaredError()
optimizer = tf.keras.optimizers.SGD()
with tf.device("/CPU:0"):
test_layer = nn_blocks.ConvBN(filters, kernel_size=(3, 3), padding="same")
init = tf.random_normal_initializer()
x = tf.Variable(initial_value=init(shape=(1, 224, 224,
3), dtype=tf.float32))
y = tf.Variable(
initial_value=init(shape=(1, 224, 224, filters), dtype=tf.float32))
with tf.GradientTape() as tape:
x_hat = test_layer(x)
grad_loss = loss(x_hat, y)
grad = tape.gradient(grad_loss, test_layer.trainable_variables)
optimizer.apply_gradients(zip(grad, test_layer.trainable_variables))
self.assertNotIn(None, grad)
class DarkResidualTest(tf.test.TestCase, parameterized.TestCase):
@parameterized.named_parameters(("same", 224, 224, 64, False),
("downsample", 223, 223, 32, True),
("oddball", 223, 223, 32, False))
def test_pass_through(self, width, height, filters, downsample):
mod = 1
if downsample:
mod = 2
x = tf.keras.Input(shape=(width, height, filters))
test_layer = nn_blocks.DarkResidual(filters=filters, downsample=downsample)
outx = test_layer(x)
print(outx)
print(outx.shape.as_list())
self.assertAllEqual(
outx.shape.as_list(),
[None, np.ceil(width / mod),
np.ceil(height / mod), filters])
@parameterized.named_parameters(("same", 64, 224, 224, False),
("downsample", 32, 223, 223, True),
("oddball", 32, 223, 223, False))
def test_gradient_pass_though(self, filters, width, height, downsample):
loss = tf.keras.losses.MeanSquaredError()
optimizer = tf.keras.optimizers.SGD()
test_layer = nn_blocks.DarkResidual(filters, downsample=downsample)
if downsample:
mod = 2
else:
mod = 1
init = tf.random_normal_initializer()
x = tf.Variable(
initial_value=init(shape=(1, width, height, filters), dtype=tf.float32))
y = tf.Variable(initial_value=init(shape=(1, int(np.ceil(width / mod)),
int(np.ceil(height / mod)),
filters),
dtype=tf.float32))
with tf.GradientTape() as tape:
x_hat = test_layer(x)
grad_loss = loss(x_hat, y)
grad = tape.gradient(grad_loss, test_layer.trainable_variables)
optimizer.apply_gradients(zip(grad, test_layer.trainable_variables))
self.assertNotIn(None, grad)
if __name__ == "__main__":
tf.test.main()
# Lint as: python3
# Copyright 2020 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Image classification task definition."""
import tensorflow as tf
from official.core import input_reader
from official.core import task_factory
from official.vision.beta.dataloaders import classification_input
from official.vision.beta.projects.yolo.configs import darknet_classification as exp_cfg
from official.vision.beta.projects.yolo.dataloaders import classification_tfds_decoder as cli
from official.vision.beta.tasks import image_classification
@task_factory.register_task_cls(exp_cfg.ImageClassificationTask)
class ImageClassificationTask(image_classification.ImageClassificationTask):
"""A task for image classification."""
def build_inputs(self, params, input_context=None):
"""Builds classification input."""
num_classes = self.task_config.model.num_classes
input_size = self.task_config.model.input_size
if params.tfds_name:
decoder = cli.Decoder()
else:
decoder = classification_input.Decoder()
parser = classification_input.Parser(
output_size=input_size[:2],
num_classes=num_classes,
dtype=params.dtype)
reader = input_reader.InputReader(
params,
dataset_fn=tf.data.TFRecordDataset,
decoder_fn=decoder.decode,
parser_fn=parser.parse_fn(params.is_training))
dataset = reader.read(input_context=input_context)
return dataset
def train_step(self, inputs, model, optimizer, metrics=None):
"""Does forward and backward.
Args:
inputs: a dictionary of input tensors.
model: the model, forward pass definition.
optimizer: the optimizer for this training step.
metrics: a nested structure of metrics objects.
Returns:
A dictionary of logs.
"""
features, labels = inputs
if self.task_config.losses.one_hot:
labels = tf.one_hot(labels, self.task_config.model.num_classes)
num_replicas = tf.distribute.get_strategy().num_replicas_in_sync
with tf.GradientTape() as tape:
outputs = model(features, training=True)
      # Casting the output layer as float32 is necessary when mixed_precision
      # is mixed_float16 or mixed_bfloat16 to ensure the output is cast to
      # float32.
outputs = tf.nest.map_structure(
lambda x: tf.cast(x, tf.float32), outputs)
# Computes per-replica loss.
loss = self.build_losses(
model_outputs=outputs, labels=labels, aux_losses=model.losses)
# Scales loss as the default gradients allreduce performs sum inside the
# optimizer.
scaled_loss = loss / num_replicas
# For mixed_precision policy, when LossScaleOptimizer is used, loss is
# scaled for numerical stability.
if isinstance(
optimizer, tf.keras.mixed_precision.experimental.LossScaleOptimizer):
scaled_loss = optimizer.get_scaled_loss(scaled_loss)
tvars = model.trainable_variables
grads = tape.gradient(scaled_loss, tvars)
# Scales back gradient before apply_gradients when LossScaleOptimizer is
# used.
if isinstance(
optimizer, tf.keras.mixed_precision.experimental.LossScaleOptimizer):
grads = optimizer.get_unscaled_gradients(grads)
# Apply gradient clipping.
if self.task_config.gradient_clip_norm > 0:
grads, _ = tf.clip_by_global_norm(
grads, self.task_config.gradient_clip_norm)
optimizer.apply_gradients(list(zip(grads, tvars)))
logs = {self.loss: loss}
if metrics:
self.process_metrics(metrics, labels, outputs)
logs.update({m.name: m.result() for m in metrics})
elif model.compiled_metrics:
self.process_compiled_metrics(model.compiled_metrics, labels, outputs)
logs.update({m.name: m.result() for m in model.metrics})
return logs
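# Reference-only sketch (not used by the task above): the LossScaleOptimizer
# pattern from train_step, reduced to a standalone toy example so the scale /
# unscale steps are easier to follow. The tiny Dense model and the random data
# are hypothetical and exist purely for illustration.
def _loss_scale_example():
  model = tf.keras.Sequential([tf.keras.layers.Dense(1)])
  optimizer = tf.keras.mixed_precision.experimental.LossScaleOptimizer(
      tf.keras.optimizers.SGD(), loss_scale='dynamic')
  features = tf.random.normal([4, 8])
  labels = tf.random.normal([4, 1])
  with tf.GradientTape() as tape:
    loss = tf.reduce_mean(tf.square(model(features, training=True) - labels))
    # Scale the loss up so small float16 gradients do not underflow.
    scaled_loss = optimizer.get_scaled_loss(loss)
  grads = tape.gradient(scaled_loss, model.trainable_variables)
  # Undo the scaling before the gradients are applied.
  grads = optimizer.get_unscaled_gradients(grads)
  optimizer.apply_gradients(zip(grads, model.trainable_variables))
  return loss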
# Lint as: python3
# Copyright 2020 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""TensorFlow Model Garden Vision training driver."""
from absl import app
from absl import flags
import gin
from official.common import distribute_utils
from official.common import flags as tfm_flags
from official.core import task_factory
from official.core import train_lib
from official.core import train_utils
from official.modeling import performance
from official.vision.beta.projects.yolo.common import registry_imports # pylint: disable=unused-import
FLAGS = flags.FLAGS
'''
python3 -m official.vision.beta.projects.yolo.train --mode=train_and_eval --experiment=darknet_classification --model_dir=training_dir --config_file=official/vision/beta/projects/yolo/configs/experiments/darknet53_tfds.yaml
'''
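# An eval-only run of the same experiment can be launched by changing the mode
# flag (assuming the standard Model Garden modes such as 'eval' and
# 'train_and_eval' supported by train_lib.run_experiment), for example:
# python3 -m official.vision.beta.projects.yolo.train --mode=eval --experiment=darknet_classification --model_dir=training_dir --config_file=official/vision/beta/projects/yolo/configs/experiments/darknet53_tfds.yaml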
def main(_):
gin.parse_config_files_and_bindings(FLAGS.gin_file, FLAGS.gin_params)
print(FLAGS.experiment)
params = train_utils.parse_configuration(FLAGS)
model_dir = FLAGS.model_dir
if 'train' in FLAGS.mode:
# Pure eval modes do not output yaml files. Otherwise continuous eval job
# may race against the train job for writing the same file.
train_utils.serialize_config(params, model_dir)
  # Sets the mixed_precision policy. Using 'mixed_float16' or 'mixed_bfloat16'
  # can have a significant impact on model speed by utilizing float16 in the
  # case of GPUs and bfloat16 in the case of TPUs. loss_scale takes effect only
  # when the dtype is float16.
if params.runtime.mixed_precision_dtype:
performance.set_mixed_precision_policy(params.runtime.mixed_precision_dtype,
params.runtime.loss_scale)
distribution_strategy = distribute_utils.get_distribution_strategy(
distribution_strategy=params.runtime.distribution_strategy,
all_reduce_alg=params.runtime.all_reduce_alg,
num_gpus=params.runtime.num_gpus,
tpu_address=params.runtime.tpu)
with distribution_strategy.scope():
task = task_factory.get_task(params.task, logging_dir=model_dir)
train_lib.run_experiment(
distribution_strategy=distribution_strategy,
task=task,
mode=FLAGS.mode,
params=params,
model_dir=model_dir)
if __name__ == '__main__':
tfm_flags.define_flags()
app.run(main)