ModelZoo / ResNet50_tensorflow / Commits

Commit 64f16d61, authored Jun 14, 2021 by Akhil Chinnakotla
Grammar & Spelling Fixes
Parent: c02980f4

Showing 10 changed files with 196 additions and 197 deletions
official/vision/beta/projects/yolo/README.md                           +22 −22
official/vision/beta/projects/yolo/configs/darknet_classification.py    +1  −1
official/vision/beta/projects/yolo/modeling/backbones/darknet.py        +8  −7
official/vision/beta/projects/yolo/modeling/backbones/darknet_test.py   +1  −1
official/vision/beta/projects/yolo/modeling/decoders/yolo_decoder.py    +8  −6
official/vision/beta/projects/yolo/modeling/heads/yolo_head_test.py     +1  −2
official/vision/beta/projects/yolo/modeling/layers/nn_blocks.py        +61 −62
official/vision/beta/projects/yolo/ops/box_ops.py                      +32 −33
official/vision/beta/projects/yolo/ops/math_ops.py                     +19 −19
official/vision/beta/projects/yolo/ops/nms_ops.py                      +43 −44
official/vision/beta/projects/yolo/README.md

...
@@ -14,30 +14,30 @@ repository.
 ## Description
 
-Yolo v1 the original implementation was released in 2015 providing a ground
-breaking algorithm that would quickly process images, and locate objects in a
-single pass through the detector. The original implementation based
-backbone derived from state of the art object classifier of the time, like
-[GoogLeNet](https://arxiv.org/abs/1409.4842) and
-[VGG](https://arxiv.org/abs/1409.1556). More attention was given to the novel
-Yolo Detection head that allowed for Object Detection with a single pass of an
-image. Though limited, the network could predict up to 90 bounding boxes per
-image, and was tested for about 80 classes per box. Also, the model could only
-make prediction at one scale. These attributes caused yolo v1 to be more
-limited, and less versatile, so as the year passed, the Developers continued to
-update and develop this model.
+YOLO v1 the original implementation was released in 2015 providing a ground
+breaking algorithm that would quickly process images and locate objects in a
+single pass through the detector. The original implementation used a
+backbone derived from state of the art object classifiers of the time, like
+[GoogLeNet](https://arxiv.org/abs/1409.4842) and
+[VGG](https://arxiv.org/abs/1409.1556). More attention was given to the novel
+YOLO Detection head that allowed for Object Detection with a single pass of an
+image. Though limited, the network could predict up to 90 bounding boxes per
+image, and was tested for about 80 classes per box. Also, the model can only
+make predictions at one scale. These attributes caused YOLO v1 to be more
+limited and less versatile, so as the year passed, the Developers continued to
+update and develop this model.
 
-Yolo v3 and v4 serve as the most up to date and capable versions of the Yolo
-network group. These model uses a custom backbone called Darknet53 that uses
+YOLO v3 and v4 serve as the most up to date and capable versions of the YOLO
+network group. This model uses a custom backbone called Darknet53 that uses
 knowledge gained from the ResNet paper to improve its predictions. The new
 backbone also allows for objects to be detected at multiple scales. As for the
 new detection head, the model now predicts the bounding boxes using a set of
-anchor box priors (Anchor Boxes) as suggestions. The multiscale predictions in
-combination with Anchor boxes allow for the network to make up to 1000 object
-predictions on a single image. Finally, the new loss function forces the
-network to make better predictions by using Intersection Over Union (IOU) to
-inform the model's confidence rather than relying on the mean squared error
-for the entire output.
+anchor box priors (Anchor Boxes) as suggestions. Multiscale predictions in
+combination with the Anchor boxes allows for the network to make up to 1000
+object predictions on a single image. Finally, the new loss function forces the
+network to make better prediction by using Intersection Over Union (IOU) to
+inform the model's confidence rather than relying on the mean squared error for
+the entire output.
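As context for the IOU-based confidence the README describes, here is a minimal sketch of intersection over union for axis-aligned boxes. The `(xmin, ymin, xmax, ymax)` tuple format is illustrative only, not the box format the repository's `box_ops.py` actually uses.

```python
def iou(a, b):
    """Intersection over union of two (xmin, ymin, xmax, ymax) boxes."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))  # overlap width
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))  # overlap height
    inter = ix * iy
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

# A prediction that half-overlaps the ground truth scores IOU of 1/3,
# a far more direct localization signal than a mean squared error.
print(iou((0, 0, 2, 2), (1, 0, 3, 2)))
```

Using IOU to set the confidence target ties the score to localization quality directly, which is the point the paragraph above is making.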
 ## Authors
...
@@ -56,9 +56,9 @@ the entire output.
 ## Our Goal
 
-Our goal with this model conversion is to provide implementation of the
-Backbone and Yolo Head. We have built the model in such a way that the Yolo
-head could be connected to a new, more powerful backbone if a person chose to.
+Our goal with this model conversion is to provide implementations of the
+Backbone and YOLO Head. We have built the model in such a way that the YOLO
+head could be connected to a new, more powerful backbone if a person chose to.
 
 ## Models in the library
...
official/vision/beta/projects/yolo/configs/darknet_classification.py

...
@@ -35,7 +35,7 @@ class ImageClassificationModel(hyperparams.Config):
       type='darknet', darknet=backbones.Darknet())
   dropout_rate: float = 0.0
   norm_activation: common.NormActivation = common.NormActivation()
-  # Adds a BatchNormalization layer pre-GlobalAveragePooling in classification
+  # Adds a BatchNormalization layer pre-GlobalAveragePooling in classification.
   add_head_batch_norm: bool = False
...
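The hunk above edits a `hyperparams.Config` subclass from the TF Model Garden. As a rough illustration of that attribute-with-defaults config pattern (a plain dataclass stand-in, not the real `hyperparams.Config`, and `backbone_type` is a made-up field name):

```python
from dataclasses import dataclass

@dataclass
class ImageClassificationModelSketch:
    """Illustrative stand-in for the hyperparams.Config subclass above."""
    backbone_type: str = 'darknet'
    dropout_rate: float = 0.0
    # Adds a BatchNormalization layer pre-GlobalAveragePooling in classification.
    add_head_batch_norm: bool = False

# Fields can be overridden per experiment while defaults cover the rest.
cfg = ImageClassificationModelSketch(add_head_batch_norm=True)
print(cfg.backbone_type, cfg.add_head_batch_norm)
```

Typed defaults like `add_head_batch_norm: bool = False` are what make a flag such as the one this commit documents discoverable from the config alone.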
official/vision/beta/projects/yolo/modeling/backbones/darknet.py

...
@@ -16,7 +16,7 @@
 """Contains definitions of Darknet Backbone Networks.
 
-   The models are inspired by ResNet, and CSPNet
+   These models are inspired by ResNet and CSPNet.
 
 Residual networks (ResNets) were proposed in:
 [1] Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun
...
@@ -49,7 +49,7 @@ from official.vision.beta.projects.yolo.modeling.layers import nn_blocks
 class BlockConfig:
   """
-  Class to store layer config to make code more readable
+  This is a class to store layer config to make code more readable.
   """
 
   def __init__(self, layer, stack, reps, bottleneck, filters, pool_size,
...
@@ -69,7 +69,7 @@ class BlockConfig:
     padding: An `int` for the padding to apply to layers in this stack.
     activation: A `str` for the activation to use for this stack.
     route: An `int` for the level to route from to get the next input.
-    dilation_rate: An `int` for the scale used in dialated Darknet.
+    dilation_rate: An `int` for the scale used in dilated Darknet.
     output_name: A `str` for the name to use for this output.
     is_output: A `bool` for whether this layer is an output in the default
       model.
...
@@ -99,9 +99,10 @@ def build_block_specs(config):
 class LayerBuilder:
   """
-  class for quick look up of default layers used by darknet to
-  connect, introduce or exit a level. Used in place of an if condition
-  or switch to make adding new layers easier and to reduce redundant code
+  This is a class that is used for quick look up of default layers used
+  by darknet to connect, introduce or exit a level. Used in place of an
+  if condition or switch to make adding new layers easier and to reduce
+  redundant code.
   """
 
   def __init__(self):
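The dict-based lookup the LayerBuilder docstring describes ("used in place of an if condition or switch") can be sketched as follows. The layer names and factory functions here are hypothetical stand-ins, not the ones darknet.py actually registers:

```python
# Hypothetical factories standing in for the real nn_blocks layers.
def make_conv(**kwargs):
    return ('ConvBN', kwargs)

def make_residual(**kwargs):
    return ('DarkResidual', kwargs)

class LayerBuilderSketch:
    """Look up a layer factory by name instead of an if/elif chain."""

    def __init__(self):
        # Adding a new layer type is one dict entry, not a new branch.
        self._layers = {'conv': make_conv, 'residual': make_residual}

    def __call__(self, name, **kwargs):
        return self._layers[name](**kwargs)

builder = LayerBuilderSketch()
print(builder('residual', filters=64))
```

The design choice is the usual one: a dispatch table keeps the builder open for extension without touching existing control flow.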
...
@@ -377,7 +378,7 @@ BACKBONES = {
 @tf.keras.utils.register_keras_serializable(package='yolo')
 class Darknet(tf.keras.Model):
-  """ The Darknet backbone architecture """
+  """ The Darknet backbone architecture. """
 
   def __init__(
       self,
...
official/vision/beta/projects/yolo/modeling/backbones/darknet_test.py

...
@@ -13,7 +13,7 @@
 # limitations under the License.
 # Lint as: python3
-"""Tests for yolo."""
+"""Tests for YOLO."""
 from absl.testing import parameterized
 import numpy as np
...
official/vision/beta/projects/yolo/modeling/decoders/yolo_decoder.py

...
@@ -13,7 +13,7 @@
 # limitations under the License.
 # Lint as: python3
-"""Feature Pyramid Network and Path Aggregation variants used in YOLO"""
+"""Feature Pyramid Network and Path Aggregation variants used in YOLO."""
 import tensorflow as tf
 from official.vision.beta.projects.yolo.modeling.layers import nn_blocks
...
@@ -23,8 +23,10 @@ from official.vision.beta.projects.yolo.modeling.layers import nn_blocks
 class _IdentityRoute(tf.keras.layers.Layer):
 
   def __init__(self, **kwargs):
-    """Private class to mirror the outputs of blocks in nn_blocks for an easier
-    programatic generation of the feature pyramid network"""
+    """
+    Private class to mirror the outputs of blocks in nn_blocks for an easier
+    programatic generation of the feature pyramid network.
+    """
     super().__init__(**kwargs)
...
@@ -125,7 +127,7 @@ class YoloFPN(tf.keras.layers.Layer):
     # directly connect to an input path and process it
     self.preprocessors = dict()
     # resample an input and merge it with the output of another path
-    # inorder to aggregate backbone outputs
+    # in order to aggregate backbone outputs
     self.resamples = dict()
     # set of convoltion layers and upsample layers that are used to
     # prepare the FPN processors for output
...
@@ -214,7 +216,7 @@ class YoloPAN(tf.keras.layers.Layer):
       kernel_initializer: kernel_initializer for convolutional layers.
       kernel_regularizer: tf.keras.regularizers.Regularizer object for Conv2D.
       bias_regularizer: tf.keras.regularizers.Regularizer object for Conv2d.
-      fpn_input: `bool`, for whether the input into this fucntion is an FPN or
+      fpn_input: `bool`, for whether the input into this function is an FPN or
         a backbone.
       fpn_filter_scale: `int`, scaling factor for the FPN filters.
       **kwargs: keyword arguments to be passed.
...
@@ -268,7 +270,7 @@ class YoloPAN(tf.keras.layers.Layer):
     # directly connect to an input path and process it
     self.preprocessors = dict()
     # resample an input and merge it with the output of another path
-    # inorder to aggregate backbone outputs
+    # in order to aggregate backbone outputs
     self.resamples = dict()
     # FPN will reverse the key process order for the backbone, so we need
...
official/vision/beta/projects/yolo/modeling/heads/yolo_head_test.py

...
@@ -13,7 +13,7 @@
 # limitations under the License.
 # Lint as: python3
-"""Tests for yolo heads."""
+"""Tests for YOLO heads."""
 # Import libraries
 from absl.testing import parameterized
...
@@ -44,7 +44,6 @@ class YoloDecoderTest(parameterized.TestCase, tf.test.TestCase):
       inputs[key] = tf.ones(input_shape[key], dtype=tf.float32)
 
     endpoints = head(inputs)
-    # print(endpoints)
 
     for key in endpoints.keys():
       expected_input_shape = input_shape[key]
...
official/vision/beta/projects/yolo/modeling/layers/nn_blocks.py

...
@@ -14,7 +14,7 @@
 # Lint as: python3
-"""Contains common building blocks for yolo neural networks."""
+"""Contains common building blocks for YOLO neural networks."""
 from typing import Callable, List
 import tensorflow as tf
 from official.modeling import tf_utils
...
@@ -35,9 +35,9 @@ class Identity(tf.keras.layers.Layer):
 class ConvBN(tf.keras.layers.Layer):
   """
   Modified Convolution layer to match that of the Darknet Library.
-  The Layer is a standards combination of Conv BatchNorm Activation,
-  however, the use of bias in the conv is determined by the use of
-  batch normalization.
+  The Layer is a standard combination of Conv BatchNorm Activation,
+  however, the use of bias in the Conv is determined by the use of batch
+  normalization.
 
   Cross Stage Partial networks (CSPNets) were proposed in:
   [1] Chien-Yao Wang, Hong-Yuan Mark Liao, I-Hau Yeh, Yueh-Hua Wu,
       Ping-Yang Chen, Jun-Wei Hsieh
...
@@ -71,16 +71,16 @@ class ConvBN(tf.keras.layers.Layer):
       use.
     padding: string 'valid' or 'same', if same, then pad the image, else do
       not.
-    dialtion_rate: tuple to indicate how much to modulate kernel weights and
+    dilation_rate: tuple to indicate how much to modulate kernel weights and
       how many pixels in a feature map to skip.
     kernel_initializer: string to indicate which function to use to initialize
       weights.
     bias_initializer: string to indicate which function to use to initialize
       bias.
-    kernel_regularizer: string to indicate which function to use to
-      regularizer weights.
     bias_regularizer: string to indicate which function to use to regularizer
       bias.
+    kernel_regularizer: string to indicate which function to use to
+      regularizer weights.
     use_bn: boolean for whether to use batch normalization.
     use_sync_bn: boolean for whether sync batch normalization statistics
       of all batch norm layers to the models global statistics
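The ConvBN docstring's rule that bias is "determined by the use of batch normalization" reduces to `use_bias = not use_bn`: when batch norm follows the conv, its learned offset (beta) makes a separate conv bias redundant. A hedged sketch of that rule, not the actual ConvBN implementation:

```python
def conv_kwargs(filters, use_bn=True):
    """Choose conv arguments the way the docstring describes: a conv
    followed by batch norm drops its bias, since BN's beta offset
    subsumes it; without BN the bias is kept."""
    return {'filters': filters, 'use_bias': not use_bn}

print(conv_kwargs(64, use_bn=True))
print(conv_kwargs(64, use_bn=False))
```

In a real layer these kwargs would be forwarded to the convolution; here they just make the rule explicit.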
...
@@ -191,7 +191,7 @@ class ConvBN(tf.keras.layers.Layer):
 @tf.keras.utils.register_keras_serializable(package='yolo')
 class DarkResidual(tf.keras.layers.Layer):
   """
-  Darknet block with Residual connection for Yolo v3 Backbone
+  Darknet block with Residual connection for YOLO v3 Backbone
   """
 
   def __init__(self,
...
@@ -228,8 +228,6 @@ class DarkResidual(tf.keras.layers.Layer):
       (across all input batches).
     norm_momentum: float for moment to use for batch normalization.
     norm_epsilon: float for batch normalization epsilon.
-    conv_activation: string or None for activation function to use in layer,
-      if None activation is replaced by linear.
     leaky_alpha: float to use as alpha if activation function is leaky.
     sc_activation: string for activation function to use in layer.
     downsample: boolean for if image input is larger than layer output, set
...
@@ -352,10 +350,10 @@ class DarkResidual(tf.keras.layers.Layer):
 @tf.keras.utils.register_keras_serializable(package='yolo')
 class CSPTiny(tf.keras.layers.Layer):
   """
-  A Small size convolution block proposed in the CSPNet. The layer uses
-  shortcuts, routing(concatnation), and feature grouping in order to improve
-  gradient variablity and allow for high efficency, low power residual learning
-  for small networtf.keras.
+  A small size convolution block proposed in the CSPNet. The layer uses
+  shortcuts, routing(concatenation), and feature grouping in order to improve
+  gradient variability and allow for high efficiency, low power residual
+  learning for small networks.
 
   Cross Stage Partial networks (CSPNets) were proposed in:
   [1] Chien-Yao Wang, Hong-Yuan Mark Liao, I-Hau Yeh, Yueh-Hua Wu,
       Ping-Yang Chen, Jun-Wei Hsieh
...
@@ -387,11 +385,11 @@ class CSPTiny(tf.keras.layers.Layer):
       weights.
     bias_initializer: string to indicate which function to use to initialize
       bias.
-    kernel_regularizer: string to indicate which function to use to
-      regularizer weights.
+    use_bn: boolean for whether to use batch normalization.
     bias_regularizer: string to indicate which function to use to regularizer
       bias.
-    use_bn: boolean for whether to use batch normalization.
+    kernel_regularizer: string to indicate which function to use to
+      regularizer weights.
     use_sync_bn: boolean for whether sync batch normalization statistics
       of all batch norm layers to the models global statistics
       (across all input batches).
...
@@ -401,12 +399,12 @@ class CSPTiny(tf.keras.layers.Layer):
       feature stack output.
     norm_momentum: float for moment to use for batch normalization.
     norm_epsilon: float for batch normalization epsilon.
-    conv_activation: string or None for activation function to use in layer,
-      if None activation is replaced by linear.
-    leaky_alpha: float to use as alpha if activation function is leaky.
-    sc_activation: string for activation function to use in layer.
     downsample: boolean for if image input is larger than layer output, set
       downsample to True so the dimensions are forced to match.
+    leaky_alpha: float to use as alpha if activation function is leaky.
+    sc_activation: string for activation function to use in layer.
+    conv_activation: string or None for activation function to use in layer,
+      if None activation is replaced by linear.
     **kwargs: Keyword Arguments.
   """
...
@@ -505,18 +503,18 @@ class CSPTiny(tf.keras.layers.Layer):
 @tf.keras.utils.register_keras_serializable(package='yolo')
 class CSPRoute(tf.keras.layers.Layer):
   """
-  Down sampling layer to take the place of down sampleing done in Residual
+  Down sampling layer to take the place of down sampling done in Residual
   networks. This is the first of 2 layers needed to convert any Residual Network
   model to a CSPNet. At the start of a new level change, this CSPRoute layer
-  creates a learned identity that will act as a cross stage connection, that
-  is used to inform the inputs to the next stage. It is called cross stage
-  partial because the number of filters required in every intermitent Residual
+  creates a learned identity that will act as a cross stage connection that
+  is used to inform the inputs to the next stage. This is called cross stage
+  partial because the number of filters required in every intermittent residual
   layer is reduced by half. The sister layer will take the partial generated by
-  this layer and concatnate it with the output of the final residual layer in
-  the stack to create a fully feature level output. This concatnation merges the
+  this layer and concatenate it with the output of the final residual layer in
+  the stack to create a fully feature level output. This concatenation merges the
   partial blocks of 2 levels as input to the next allowing the gradients of each
   level to be more unique, and reducing the number of parameters required by
   each level by 50% while keeping accuracy consistent.
 
   Cross Stage Partial networks (CSPNets) were proposed in:
   [1] Chien-Yao Wang, Hong-Yuan Mark Liao, I-Hau Yeh, Yueh-Hua Wu,
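The cross stage partial split the CSPRoute docstring describes (half the filters enter the residual stack, half skip across the stage to be merged back by the sister CSPConnect layer) comes down to channel arithmetic. A minimal sketch of that arithmetic with an illustrative helper, not the real layer:

```python
def csp_split(filters, filter_scale=2):
    """Split a level's filters the way CSPRoute does: the partial
    branch processed by the residual stack carries filters // filter_scale
    channels, so each intermediate residual layer needs roughly half the
    parameters; the rest ride the cross stage identity."""
    partial = filters // filter_scale   # goes through the residual stack
    identity = filters - partial        # learned identity across the stage
    return partial, identity

# CSPConnect later concatenates the two branches back to full width.
partial, identity = csp_split(256)
print(partial, identity, partial + identity)
```

With `filter_scale=2` the split is even, matching the docstring's "reduced by half" claim while the concatenated output keeps the original channel count.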
...
@@ -544,24 +542,24 @@ class CSPRoute(tf.keras.layers.Layer):
     """
     Args:
       filters: integer for output depth, or the number of features to learn
-      filter_scale: integer dicating (filters//2) or the number of filters in
+      filter_scale: integer dictating (filters//2) or the number of filters in
         the partial feature stack.
-      downsample: down_sample the input.
       activation: string for activation function to use in layer.
       kernel_initializer: string to indicate which function to use to
         initialize weights.
       bias_initializer: string to indicate which function to use to initialize
         bias.
-      kernel_regularizer: string to indicate which function to use to
-        regularizer weights.
       bias_regularizer: string to indicate which function to use to regularizer
         bias.
+      kernel_regularizer: string to indicate which function to use to
+        regularizer weights.
       use_bn: boolean for whether to use batch normalization.
       use_sync_bn: boolean for whether sync batch normalization statistics
         of all batch norm layers to the models global statistics
        (across all input batches).
       norm_momentum: float for moment to use for batch normalization.
       norm_epsilon: float for batch normalization epsilon.
+      downsample: down_sample the input.
       **kwargs: Keyword Arguments.
     """
...
@@ -571,7 +569,7 @@ class CSPRoute(tf.keras.layers.Layer):
     self._filter_scale = filter_scale
     self._activation = activation
 
-    # convoultion params
+    # convolution params
     self._kernel_initializer = kernel_initializer
     self._bias_initializer = bias_initializer
     self._kernel_regularizer = kernel_regularizer
...
@@ -638,7 +636,7 @@ class CSPRoute(tf.keras.layers.Layer):
 class CSPConnect(tf.keras.layers.Layer):
   """
   Sister Layer to the CSPRoute layer. Merges the partial feature stacks
-  generated by the CSPDownsampling layer, and the finaly output of the
+  generated by the CSPDownsampling layer, and the final output of the
   residual stack. Suggested in the CSPNet paper.
 
   Cross Stage Partial networks (CSPNets) were proposed in:
   [1] Chien-Yao Wang, Hong-Yuan Mark Liao, I-Hau Yeh, Yueh-Hua Wu,
...
@@ -675,10 +673,10 @@ class CSPConnect(tf.keras.layers.Layer):
       weights.
     bias_initializer: string to indicate which function to use to initialize
       bias.
-    kernel_regularizer: string to indicate which function to use to
-      regularizer weights.
     bias_regularizer: string to indicate which function to use to regularizer
       bias.
+    kernel_regularizer: string to indicate which function to use to
+      regularizer weights.
     use_bn: boolean for whether to use batch normalization.
     use_sync_bn: boolean for whether sync batch normalization statistics
       of all batch norm layers to the models global
...
@@ -750,13 +748,13 @@ class CSPConnect(tf.keras.layers.Layer):
 class CSPStack(tf.keras.layers.Layer):
   """
-  CSP full stack, combines the route and the connect in case you dont want to
-  jsut quickly wrap an existing callable or list of layers to make it a cross
+  CSP full stack, combines the route and the connect in case you don't want to
+  just quickly wrap an existing callable or list of layers to make it a cross
   stage partial. Added for ease of use. you should be able to wrap any layer
-  stack with a CSP independent of wether it belongs to the Darknet family. if
+  stack with a CSP independent of whether it belongs to the Darknet family. If
   filter_scale = 2, then the blocks in the stack passed into the the CSP stack
   should also have filters = filters/filter_scale
 
   Cross Stage Partial networks (CSPNets) were proposed in:
   [1] Chien-Yao Wang, Hong-Yuan Mark Liao, I-Hau Yeh, Yueh-Hua Wu,
       Ping-Yang Chen, Jun-Wei Hsieh
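CSPStack's wrap-a-callable behavior can be sketched with plain Python callables standing in for Keras layers. The split and the final merge below are illustrative stand-ins for the real CSPRoute and CSPConnect layers, and `csp_stack_sketch` is a hypothetical name:

```python
def csp_stack_sketch(model_to_wrap):
    """Wrap a callable or list of callables so that half the features
    bypass them, mimicking the route/connect pair around a stack."""
    layers = model_to_wrap if isinstance(model_to_wrap, list) else [model_to_wrap]

    def call(x):
        half = len(x) // 2
        partial, identity = x[:half], x[half:]  # stand-in for CSPRoute's split
        for layer in layers:                    # the wrapped stack, run in order
            partial = layer(partial)
        return partial + identity               # stand-in for CSPConnect's merge

    return call

stack = csp_stack_sketch([lambda v: [i * 2 for i in v]])
print(stack([1, 2, 3, 4]))
```

Only the first half of the "features" pass through the wrapped stack; the second half is carried across untouched and merged at the end, which is the cross stage partial pattern the docstring is describing.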
...
@@ -781,11 +779,10 @@ class CSPStack(tf.keras.layers.Layer):
                **kwargs):
     """
     Args:
+      filters: integer for output depth, or the number of features to learn.
       model_to_wrap: callable Model or a list of callable objects that will
         process the output of CSPRoute, and be input into CSPConnect.
         list will be called sequentially.
-      downsample: down_sample the input.
-      filters: integer for output depth, or the number of features to learn.
       filter_scale: integer dicating (filters//2) or the number of filters in
         the partial feature stack.
       activation: string for activation function to use in layer.
...
@@ -793,10 +790,11 @@ class CSPStack(tf.keras.layers.Layer):
        weights.
      bias_initializer: string to indicate which function to use to initialize
        bias.
      bias_regularizer: string to indicate which function to use to regularize
        bias.
      kernel_regularizer: string to indicate which function to use to
        regularize weights.
      downsample: down_sample the input.
      use_bn: boolean for whether to use batch normalization.
      use_sync_bn: boolean for whether to sync batch normalization statistics
        of all batch norm layers to the model's global statistics.
...
@@ -891,10 +889,10 @@ class PathAggregationBlock(tf.keras.layers.Layer):
        weights.
      bias_initializer: string to indicate which function to use to initialize
        bias.
      bias_regularizer: string to indicate which function to use to regularize
        bias.
      kernel_regularizer: string to indicate which function to use to
        regularize weights.
      use_bn: boolean for whether to use batch normalization.
      use_sync_bn: boolean for whether to sync batch normalization statistics
        of all batch norm layers to the model's global statistics.
...
@@ -905,8 +903,8 @@ class PathAggregationBlock(tf.keras.layers.Layer):
      activation: string or None for activation function to use in layer,
        if None activation is replaced by linear.
      leaky_alpha: float to use as alpha if activation function is leaky.
      downsample: `bool` for whether to downsample and merge.
      upsample: `bool` for whether to upsample and merge.
      upsample_size: `int` how much to upsample in order to match shapes.
      **kwargs: Keyword Arguments.
    """
...
@@ -1050,7 +1048,7 @@ class PathAggregationBlock(tf.keras.layers.Layer):
@tf.keras.utils.register_keras_serializable(package='yolo')
class SPP(tf.keras.layers.Layer):
  """
  A non-aggregated SPP layer that uses Pooling to gain more performance.
  """

  def __init__(self, sizes, **kwargs):
...
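The SPP idea named in the docstring — max-pool the same feature map at several window sizes with stride 1 and 'same' padding, then concatenate the results — can be sketched in plain Python on a 1-D feature row. The helper names `maxpool_same_1d` and `spp_1d` are illustrative, not repository code.

```python
def maxpool_same_1d(xs, k):
    """Stride-1 max pool with 'same' padding over a 1-D feature row,
    so the output has the same length as the input."""
    half = k // 2
    return [max(xs[max(0, i - half):i + half + 1]) for i in range(len(xs))]

def spp_1d(xs, sizes=(5, 9, 13)):
    """Concatenate the input with one pooled copy per window size,
    mirroring what SPP does along the channel axis."""
    out = list(xs)
    for k in sizes:
        out += maxpool_same_1d(xs, k)
    return out
```

Because every pooled copy keeps the spatial size, the only thing SPP grows is the channel dimension: one extra copy of the input per pooling size.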
@@ -1090,7 +1088,7 @@ class SAM(tf.keras.layers.Layer):
  [1] Sanghyun Woo, Jongchan Park, Joon-Young Lee, In So Kweon
      CBAM: Convolutional Block Attention Module. arXiv:1807.06521

  Implementation of the Spatial Attention Model (SAM)
  """

  def __init__(self,
...
@@ -1167,7 +1165,7 @@ class CAM(tf.keras.layers.Layer):
  [1] Sanghyun Woo, Jongchan Park, Joon-Young Lee, In So Kweon
      CBAM: Convolutional Block Attention Module. arXiv:1807.06521

  Implementation of the Channel Attention Model (CAM)
  """

  def __init__(self,
...
@@ -1253,7 +1251,7 @@ class CBAM(tf.keras.layers.Layer):
  [1] Sanghyun Woo, Jongchan Park, Joon-Young Lee, In So Kweon
      CBAM: Convolutional Block Attention Module. arXiv:1807.06521

  Implementation of the Convolution Block Attention Module (CBAM)
  """

  def __init__(self,
...
@@ -1321,8 +1319,9 @@ class CBAM(tf.keras.layers.Layer):
@tf.keras.utils.register_keras_serializable(package='yolo')
class DarkRouteProcess(tf.keras.layers.Layer):
  """
  Processes darknet outputs and connects the backbone to the head for more
  generalizability and abstracts the repetition of DarkConv objects that is
  common in YOLO.

  It is used like the following:
...
@@ -1357,18 +1356,18 @@ class DarkRouteProcess(tf.keras.layers.Layer):
      filters: the number of filters to be used in all subsequent layers
        filters should be the depth of the tensor input into this layer,
        as no downsampling can be done within this layer object.
      repetitions: number of times to repeat the processing nodes
        for tiny: 1 repetition, no spp allowed
        for spp: insert_spp = True, and allow for 3+ repetitions
        for regular: insert_spp = False, and allow for 3+ repetitions.
      insert_spp: bool if true add the spatial pyramid pooling layer.
      kernel_initializer: method to use to initialize kernel weights.
      bias_initializer: method to use to initialize the bias of the conv
        layers.
      norm_momentum: batch norm parameter see TensorFlow documentation.
      norm_epsilon: batch norm parameter see TensorFlow documentation.
      activation: activation function to use in processing.
      leaky_alpha: if leaky activation function, the alpha to use in
        processing the relu input.

    Returns:
...
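The `repetitions`/`insert_spp` configuration above amounts to building a short list of processing nodes, with an optional SPP node slotted in. A hypothetical sketch of that layout logic (the helper and node names are illustrative only, not this layer's real build code):

```python
def build_route_plan(repetitions, insert_spp):
    """Illustrative layout of DarkRouteProcess-style nodes: a run of
    conv processing blocks, with an optional SPP slotted into the middle."""
    plan = ['conv'] * repetitions
    if insert_spp:
        # per the docstring, spp is only sensible with 3+ repetitions
        plan.insert(len(plan) // 2, 'spp')
    return plan

tiny = build_route_plan(1, False)      # the "tiny" config: one node, no spp
spp_plan = build_route_plan(3, True)   # the "spp" config
```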
official/vision/beta/projects/yolo/ops/box_ops.py
...
@@ -4,13 +4,13 @@ import math
def yxyx_to_xcycwh(box: tf.Tensor):
  """Converts boxes from ymin, xmin, ymax, xmax to x_center, y_center, width,
  height.

  Args:
    box: any `Tensor` whose last dimension is 4 representing the coordinates of
      boxes in ymin, xmin, ymax, xmax.

  Returns:
    box: a `Tensor` whose shape is the same as `box` in new format.
  """
...
@@ -52,13 +52,13 @@ def _xcycwh_to_yxyx(box: tf.Tensor, scale):
def xcycwh_to_yxyx(box: tf.Tensor, darknet=False):
  """Converts boxes from x_center, y_center, width, height to ymin, xmin, ymax,
  xmax.

  Args:
    box: any `Tensor` whose last dimension is 4 representing the coordinates of
      boxes in x_center, y_center, width, height.

  Returns:
    box: a `Tensor` whose shape is the same as `box` in new format.
  """
...
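Both conversions are pure coordinate arithmetic, so they round-trip exactly. A framework-free sketch on a single box (the `_1`-suffixed helper names are illustrative, not repository code):

```python
def yxyx_to_xcycwh_1(b):
    """Single-box version of the yxyx -> xcycwh conversion."""
    ymin, xmin, ymax, xmax = b
    return ((xmin + xmax) / 2, (ymin + ymax) / 2, xmax - xmin, ymax - ymin)

def xcycwh_to_yxyx_1(b):
    """Single-box inverse: xcycwh -> yxyx."""
    xc, yc, w, h = b
    return (yc - h / 2, xc - w / 2, yc + h / 2, xc + w / 2)

box = yxyx_to_xcycwh_1((0, 0, 2, 4))        # center (2, 1), width 4, height 2
back = xcycwh_to_yxyx_1(box)                # recovers the yxyx corners
```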
@@ -75,9 +75,9 @@ def intersect_and_union(box1, box2, yxyx=False):
  """Calculates the intersection and union between box1 and box2.

  Args:
    box1: any `Tensor` whose last dimension is 4 representing the coordinates of
      boxes.
    box2: any `Tensor` whose last dimension is 4 representing the coordinates of
      boxes.
    yxyx: a `bool` indicating whether the input box is of the format x_center,
      y_center, width, height or y_min, x_min, y_max, x_max.
...
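The intersection/union arithmetic behind this op, and behind `compute_iou` further down, reduces to clamped min/max arithmetic on the corners. A minimal pure-Python sketch for a single pair of yxyx boxes (the helper name is illustrative):

```python
def iou_yxyx(b1, b2):
    """IoU between two boxes given as (ymin, xmin, ymax, xmax)."""
    # intersection corners: the tighter of each pair of edges
    inter_h = max(0.0, min(b1[2], b2[2]) - max(b1[0], b2[0]))
    inter_w = max(0.0, min(b1[3], b2[3]) - max(b1[1], b2[1]))
    inter = inter_h * inter_w
    area = lambda b: (b[2] - b[0]) * (b[3] - b[1])
    union = area(b1) + area(b2) - inter
    return inter / union if union else 0.0
```

Clamping each intersection side at zero is what makes disjoint boxes come out with IoU 0 instead of a negative "area".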
@@ -109,15 +109,15 @@ def smallest_encompassing_box(box1, box2, yxyx=False):
    box1 and box2.

  Args:
    box1: any `Tensor` whose last dimension is 4 representing the coordinates of
      boxes.
    box2: any `Tensor` whose last dimension is 4 representing the coordinates of
      boxes.
    yxyx: a `bool` indicating whether the input box is of the format x_center,
      y_center, width, height or y_min, x_min, y_max, x_max.

  Returns:
    box_c: a `Tensor` whose last dimension is 4 representing the coordinates of
      boxes; the return format is y_min, x_min, y_max, x_max if yxyx is set
      to True. In other words it will match the input format.
  """
...
@@ -145,9 +145,9 @@ def compute_iou(box1, box2, yxyx=False):
  """Calculates the intersection over union between box1 and box2.

  Args:
    box1: any `Tensor` whose last dimension is 4 representing the coordinates of
      boxes.
    box2: any `Tensor` whose last dimension is 4 representing the coordinates of
      boxes.
    yxyx: a `bool` indicating whether the input box is of the format x_center,
      y_center, width, height or y_min, x_min, y_max, x_max.
...
@@ -167,13 +167,13 @@ def compute_giou(box1, box2, yxyx=False, darknet=False):
  """Calculates the General intersection over union between box1 and box2.

  Args:
    box1: any `Tensor` whose last dimension is 4 representing the coordinates of
      boxes.
    box2: any `Tensor` whose last dimension is 4 representing the coordinates of
      boxes.
    yxyx: a `bool` indicating whether the input box is of the format x_center,
      y_center, width, height or y_min, x_min, y_max, x_max.
    darknet: a `bool` indicating whether the calling function is the YOLO
      darknet loss.

  Returns:
...
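GIoU extends plain IoU with a penalty based on the smallest box enclosing both inputs: `giou = iou - (area_C - union) / area_C`, where `C` is the enclosing box. This makes disjoint boxes comparable (the score goes negative as they move apart) instead of flat zero. A single-pair sketch with an illustrative helper name:

```python
def giou_yxyx(b1, b2):
    """Generalized IoU between two (ymin, xmin, ymax, xmax) boxes."""
    area = lambda b: (b[2] - b[0]) * (b[3] - b[1])
    inter = (max(0.0, min(b1[2], b2[2]) - max(b1[0], b2[0])) *
             max(0.0, min(b1[3], b2[3]) - max(b1[1], b2[1])))
    union = area(b1) + area(b2) - inter
    iou = inter / union
    # smallest box enclosing both inputs
    c = (min(b1[0], b2[0]), min(b1[1], b2[1]),
         max(b1[2], b2[2]), max(b1[3], b2[3]))
    return iou - (area(c) - union) / area(c)
```

For perfectly overlapping boxes the enclosing box equals the union, the penalty vanishes, and GIoU equals IoU; for disjoint boxes the penalty dominates and the result is negative.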
@@ -208,15 +208,15 @@ def compute_diou(box1, box2, beta=1.0, yxyx=False, darknet=False):
  """Calculates the distance intersection over union between box1 and box2.

  Args:
    box1: any `Tensor` whose last dimension is 4 representing the coordinates of
      boxes.
    box2: any `Tensor` whose last dimension is 4 representing the coordinates of
      boxes.
    beta: a `float` indicating the amount to scale the distance iou
      regularization term.
    yxyx: a `bool` indicating whether the input box is of the format x_center,
      y_center, width, height or y_min, x_min, y_max, x_max.
    darknet: a `bool` indicating whether the calling function is the YOLO
      darknet loss.

  Returns:
...
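DIoU regularizes IoU by the squared distance between the box centers, normalized by the squared diagonal of the smallest enclosing box, with `beta` scaling the penalty: `diou = iou - (d^2 / c^2)^beta`. A single-pair sketch (illustrative helper name, assuming the same yxyx layout as above):

```python
def diou_yxyx(b1, b2, beta=1.0):
    """Distance IoU between two (ymin, xmin, ymax, xmax) boxes."""
    area = lambda b: (b[2] - b[0]) * (b[3] - b[1])
    inter = (max(0.0, min(b1[2], b2[2]) - max(b1[0], b2[0])) *
             max(0.0, min(b1[3], b2[3]) - max(b1[1], b2[1])))
    iou = inter / (area(b1) + area(b2) - inter)
    # squared distance between box centers
    d2 = (((b1[0] + b1[2]) - (b2[0] + b2[2])) ** 2 +
          ((b1[1] + b1[3]) - (b2[1] + b2[3])) ** 2) / 4.0
    # squared diagonal of the smallest enclosing box
    c2 = ((max(b1[2], b2[2]) - min(b1[0], b2[0])) ** 2 +
          (max(b1[3], b2[3]) - min(b1[1], b2[1])) ** 2)
    return iou - (d2 / c2) ** beta
```

When the two boxes coincide the center distance is zero and DIoU reduces to plain IoU; separated boxes are penalized in proportion to how far apart their centers sit inside the enclosing box.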
@@ -256,13 +256,13 @@ def compute_ciou(box1, box2, yxyx=False, darknet=False):
  """Calculates the complete intersection over union between box1 and box2.

  Args:
    box1: any `Tensor` whose last dimension is 4 representing the coordinates of
      boxes.
    box2: any `Tensor` whose last dimension is 4 representing the coordinates of
      boxes.
    yxyx: a `bool` indicating whether the input box is of the format x_center,
      y_center, width, height or y_min, x_min, y_max, x_max.
    darknet: a `bool` indicating whether the calling function is the YOLO
      darknet loss.

  Returns:
...
@@ -297,23 +297,22 @@ def aggregated_comparitive_iou(boxes1,
                                boxes2=None,
                                iou_type=0,
                                beta=0.6):
  """Calculates the intersection over union between every box in boxes1 and
  every box in boxes2.

  Args:
    boxes1: a `Tensor` of shape [batch size, N, 4] representing the coordinates
      of boxes.
    boxes2: a `Tensor` of shape [batch size, N, 4] representing the coordinates
      of boxes.
    iou_type: `integer` representing the iou version to use, 0 is distance iou,
      1 is the general iou, 2 is the complete iou, any other number uses the
      standard iou.
    beta: `float` for the scaling quantity to apply to distance iou
      regularization.

  Returns:
    iou: a `Tensor` representing the intersection over union of the
      expected/input type.
  """
  boxes1 = tf.expand_dims(boxes1, axis=-2)
...
official/vision/beta/projects/yolo/ops/math_ops.py
"""A set of private math operations used to safely implement the YOLO loss."""

import tensorflow as tf
def rm_nan_inf(x, val=0.0):
  """Remove nan and infinity.

  Args:
    x: any `Tensor` of any type.
    val: value to replace nan and infinity with.

  Return:
    a `Tensor` with nan and infinity removed.
...
@@ -19,11 +19,11 @@ def rm_nan_inf(x, val=0.0):
def rm_nan(x, val=0.0):
  """Remove nan.

  Args:
    x: any `Tensor` of any type.
    val: value to replace nan.

  Return:
    a `Tensor` with nan removed.
...
@@ -35,32 +35,32 @@ def rm_nan(x, val=0.0):
def divide_no_nan(a, b):
  """Nan safe divide operation built to allow model compilation in tflite.

  Args:
    a: any `Tensor` of any type.
    b: any `Tensor` of any type with the same shape as tensor a.

  Return:
    a `Tensor` representing a divided by b, with all nan values removed.
  """
  zero = tf.cast(0.0, b.dtype)
  return tf.where(b == zero, zero, a / b)
def mul_no_nan(x, y):
  """Nan safe multiply operation built to allow model compilation in tflite and
  to allow one tensor to mask another. Wherever x is zero the
  multiplication is not computed and the value is replaced with a zero. This is
  required because 0 * nan = nan. This can make computation unstable in some
  cases where the intended behavior is for zero to mean ignore.

  Args:
    x: any `Tensor` of any type.
    y: any `Tensor` of any type with the same shape as tensor x.

  Return:
    a `Tensor` representing x times y, where x is used to safely mask the
      tensor y.
  """
  return tf.where(x == 0, tf.cast(0, x.dtype), x * y)
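The `0 * nan = nan` hazard the docstring describes is plain IEEE-754 float behavior, so it can be demonstrated without TensorFlow. The sketch below is a pure-Python analog of the masked multiply (the `_py`-suffixed name is illustrative, not part of this module):

```python
import math

def mul_no_nan_py(x, y):
    """Pure-Python analog of the masked multiply: when the mask x is zero,
    skip the multiplication entirely instead of computing 0 * y."""
    return 0.0 if x == 0 else x * y

nan = float('nan')
naive = 0.0 * nan                    # nan: the mask leaks through
naive_is_nan = math.isnan(naive)
masked = mul_no_nan_py(0.0, nan)     # 0.0: the masked value is ignored
```

This is exactly why the op uses `tf.where` to select zero rather than relying on multiplication by a zero mask: the branch that would produce nan is never evaluated in the output.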
official/vision/beta/projects/yolo/ops/nms_ops.py
...
@@ -8,13 +8,12 @@ class TiledNMS():
  IOU_TYPES = {'diou': 0, 'giou': 1, 'ciou': 2, 'iou': 3}
  def __init__(self, iou_type='diou', beta=0.6):
    '''Initialization for all non max suppression operations, mainly used to
    select hyperparameters for the iou type and scaling.

    Args:
      iou_type: `str` for the version of IOU to use {diou, giou, ciou, iou}.
      beta: `float` for the amount to scale regularization on distance iou.
    '''
    self._iou_type = TiledNMS.IOU_TYPES[iou_type]
    self._beta = beta
...
@@ -54,7 +53,7 @@ class TiledNMS():
        overlap too much with respect to IOU.
      output_size: an int32 tensor of size [batch_size], representing the number
        of selected boxes for each batch.
      idx: an integer scalar representing an induction variable.

    Returns:
      boxes: updated boxes.
...
@@ -111,10 +110,10 @@ class TiledNMS():
    Assumption:
      * The boxes are sorted by scores unless the box is a dot (all coordinates
        are zero).
      * Boxes with higher scores can be used to suppress boxes with lower
        scores.

    The overall design of the algorithm is to handle boxes tile-by-tile:

      boxes = boxes.pad_to_multiply_of(tile_size)
      num_tiles = len(boxes) // tile_size
...
@@ -126,7 +125,7 @@ class TiledNMS():
      iou = bbox_overlap(box_tile, suppressing_tile)
      # if the box is suppressed in iou, clear it to a dot
      box_tile *= _update_boxes(iou)
      # Iteratively handle the diagonal tile.
      iou = _box_overlap(box_tile, box_tile)
      iou_changed = True
      while iou_changed:
...
@@ -232,16 +231,16 @@ class TiledNMS():
    This implementation unrolls classes dimension while using the tf.while_loop
    to implement the batched NMS, so that it can be parallelized at the batch
    dimension. It should give better performance compared to the v1
    implementation. It is TPU compatible.

    Args:
      boxes: a tensor with shape [batch_size, N, num_classes, 4] or [batch_size,
        N, 1, 4], which box predictions on all feature levels. The N is the
        number of total anchors on all levels.
      scores: a tensor with shape [batch_size, N, num_classes], which stacks
        class probability on all feature levels. The N is the number of total
        anchors on all levels. The num_classes is the number of classes the
        model predicted. Note that the class_outputs here is the raw score.
      pre_nms_top_k: an int number of top candidate detections per class
        before NMS.
...
@@ -327,21 +326,21 @@ def sorted_non_max_suppression_padded(scores, boxes, max_output_size,
def sort_drop(objectness, box, classificationsi, k):
  """This function sorts and drops boxes such that there are only k boxes,
  sorted by the objectness or confidence.

  Args:
    objectness: a `Tensor` of shape [batch size, N] that needs to be
      filtered.
    box: a `Tensor` of shape [batch size, N, 4] that needs to be filtered.
    classificationsi: a `Tensor` of shape [batch size, N, num_classes] that
      needs to be filtered.
    k: an `integer` for the maximum number of boxes to keep after filtering.

  Return:
    objectness: filtered `Tensor` of shape [batch size, k]
    boxes: filtered `Tensor` of shape [batch size, k, 4]
    classifications: filtered `Tensor` of shape [batch size, k, num_classes]
  """
  # find the indexes for the boxes based on the scores
  objectness, ind = tf.math.top_k(objectness, k=k)
...
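The top-k gather that `sort_drop` performs with `tf.math.top_k` can be sketched in plain Python for a single image: rank indices by confidence, truncate to k, then gather every parallel tensor through the same index order. The `_py`-suffixed helper name is illustrative, not repository code.

```python
def sort_drop_py(objectness, boxes, k):
    """Keep only the k highest-confidence boxes, sorted by confidence,
    gathering the box list through the same index order."""
    order = sorted(range(len(objectness)),
                   key=lambda i: objectness[i], reverse=True)[:k]
    return ([objectness[i] for i in order], [boxes[i] for i in order])

kept_scores, kept_boxes = sort_drop_py([0.1, 0.9, 0.5], ['a', 'b', 'c'], 2)
```

Gathering `box` (and `classificationsi`) with the indices returned for `objectness` is what keeps the three tensors aligned after filtering.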
@@ -364,25 +363,25 @@ def sort_drop(objectness, box, classificationsi, k):
def segment_nms(boxes, classes, confidence, k, iou_thresh):
  """This is a quick nms that works very well for small values of k. It
  was developed to operate for tflite models as the tiled NMS is far too slow
  and typically is not able to compile with tflite. This NMS does not account
  for classes, and only works to quickly filter boxes on phones.

  Args:
    boxes: a `Tensor` of shape [batch size, N, 4] that needs to be filtered.
    classes: a `Tensor` of shape [batch size, N, num_classes] that needs to be
      filtered.
    confidence: a `Tensor` of shape [batch size, N] that needs to be
      filtered.
    k: an `integer` for the maximum number of boxes to keep after filtering.
    iou_thresh: a `float` for the value above which boxes are considered to be
      too similar, the closer to 1.0 the less that gets through.

  Return:
    boxes: filtered `Tensor` of shape [batch size, k, 4]
    classes: filtered `Tensor` of shape [batch size, k, num_classes]
    confidence: filtered `Tensor` of shape [batch size, k]
  """
  mrange = tf.range(k)
  mask_x = tf.tile(
...
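The class-agnostic suppression this family of ops performs is, at its core, greedy NMS: walk boxes in descending score order and drop any box whose IoU with an already-kept box exceeds the threshold. A minimal sequential sketch (the helper names are illustrative, and this is a reference for the semantics, not the vectorized tile/segment implementation used here):

```python
def _iou(b1, b2):
    """IoU between two (ymin, xmin, ymax, xmax) boxes."""
    inter = (max(0.0, min(b1[2], b2[2]) - max(b1[0], b2[0])) *
             max(0.0, min(b1[3], b2[3]) - max(b1[1], b2[1])))
    a1 = (b1[2] - b1[0]) * (b1[3] - b1[1])
    a2 = (b2[2] - b2[0]) * (b2[3] - b2[1])
    return inter / (a1 + a2 - inter)

def greedy_nms(boxes, scores, iou_thresh):
    """Class-agnostic greedy NMS: keep a box only if it does not overlap
    any already-kept box above iou_thresh. Returns kept indices."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(_iou(boxes[i], boxes[j]) <= iou_thresh for j in keep):
            keep.append(i)
    return keep
```

The tiled and segment variants above exist because this sequential loop is data-dependent and does not map well onto TPUs or tflite; they trade extra masked arithmetic for a fixed, compilable computation graph.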
@@ -416,27 +415,27 @@ def nms(boxes,
        pre_nms_thresh,
        nms_thresh,
        prenms_top_k=500):
  """This is a quick nms that works very well for small values of k. It
  was developed to operate for tflite models as the tiled NMS is far too slow
  and typically is not able to compile with tflite. This NMS does not account
  for classes, and only works to quickly filter boxes on phones.

  Args:
    boxes: a `Tensor` of shape [batch size, N, 4] that needs to be filtered.
    classes: a `Tensor` of shape [batch size, N, num_classes] that needs to be
      filtered.
    confidence: a `Tensor` of shape [batch size, N] that needs to be
      filtered.
    k: an `integer` for the maximum number of boxes to keep after filtering.
    nms_thresh: a `float` for the value above which boxes are considered to be
      too similar, the closer to 1.0 the less that gets through.
    pre_nms_top_k: an int number of top candidate detections per class
      before NMS.

  Return:
    boxes: filtered `Tensor` of shape [batch size, k, 4]
    classes: filtered `Tensor` of shape [batch size, k, num_classes]
    confidence: filtered `Tensor` of shape [batch size, k]
  """
  # sort the boxes
...