ModelZoo / ResNet50_tensorflow / Commits

Commit 64f16d61, authored Jun 14, 2021 by Akhil Chinnakotla
Grammar & Spelling Fixes
parent c02980f4

Showing 10 changed files with 196 additions and 197 deletions.
official/vision/beta/projects/yolo/README.md (+22, -22)
official/vision/beta/projects/yolo/configs/darknet_classification.py (+1, -1)
official/vision/beta/projects/yolo/modeling/backbones/darknet.py (+8, -7)
official/vision/beta/projects/yolo/modeling/backbones/darknet_test.py (+1, -1)
official/vision/beta/projects/yolo/modeling/decoders/yolo_decoder.py (+8, -6)
official/vision/beta/projects/yolo/modeling/heads/yolo_head_test.py (+1, -2)
official/vision/beta/projects/yolo/modeling/layers/nn_blocks.py (+61, -62)
official/vision/beta/projects/yolo/ops/box_ops.py (+32, -33)
official/vision/beta/projects/yolo/ops/math_ops.py (+19, -19)
official/vision/beta/projects/yolo/ops/nms_ops.py (+43, -44)
official/vision/beta/projects/yolo/README.md
...
@@ -14,30 +14,30 @@ repository.
 ## Description
-Yolo v1 the original implementation was released in 2015 providing a ground
-breaking algorithm that would quickly process images, and locate objects in a
-single pass through the detector. The original implementation based used a
-backbone derived from state of the art object classifier of the time, like
+YOLO v1 the original implementation was released in 2015 providing a ground
+breaking algorithm that would quickly process images and locate objects in a
+single pass through the detector. The original implementation used a
+backbone derived from state of the art object classifiers of the time, like
 [GoogLeNet](https://arxiv.org/abs/1409.4842) and
 [VGG](https://arxiv.org/abs/1409.1556). More attention was given to the novel
-Yolo Detection head that allowed for Object Detection with a single pass of an
+YOLO Detection head that allowed for Object Detection with a single pass of an
 image. Though limited, the network could predict up to 90 bounding boxes per
-image, and was tested for about 80 classes per box. Also, the model could only
-make prediction at one scale. These attributes caused yolo v1 to be more
-limited, and less versatile, so as the year passed, the Developers continued to
+image, and was tested for about 80 classes per box. Also, the model can only
+make predictions at one scale. These attributes caused YOLO v1 to be more
+limited and less versatile, so as the year passed, the Developers continued to
 update and develop this model.
-Yolo v3 and v4 serve as the most up to date and capable versions of the Yolo
-network group. These model uses a custom backbone called Darknet53 that uses
-knowledge gained from the ResNet paper to improve its predictions. The new
-backbone also allows for objects to be detected at multiple scales. As for the
-new detection head, the model now predicts the bounding boxes using a set of
-anchor box priors (Anchor Boxes) as suggestions. The multiscale predictions in
-combination with the Anchor boxes allows for the network to make up to 1000
-object predictions on a single image. Finally, the new loss function forces the
-network to make better prediction by using Intersection Over Union (IOU) to
-inform the model's confidence rather than relying on the mean squared error for
-the entire output.
+YOLO v3 and v4 serve as the most up to date and capable versions of the YOLO
+network group. This model uses a custom backbone called Darknet53 that uses
+knowledge gained from the ResNet paper to improve its predictions. The new
+backbone also allows for objects to be detected at multiple scales. As for the
+new detection head, the model now predicts the bounding boxes using a set of
+anchor box priors (Anchor Boxes) as suggestions. Multiscale predictions in
+combination with Anchor boxes allow for the network to make up to 1000 object
+predictions on a single image. Finally, the new loss function forces the
+network to make better predictions by using Intersection Over Union (IOU) to
+inform the model's confidence rather than relying on the mean squared error
+for the entire output.
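The loss described above keys confidence off Intersection Over Union rather than mean squared error. As a minimal illustration of that metric (plain Python, not the repository's `box_ops` implementation), IOU of two boxes in `[y_min, x_min, y_max, x_max]` form is:

```python
def iou(box1, box2):
    """Intersection over union of two [y_min, x_min, y_max, x_max] boxes."""
    ymin = max(box1[0], box2[0])
    xmin = max(box1[1], box2[1])
    ymax = min(box1[2], box2[2])
    xmax = min(box1[3], box2[3])
    # Clamp to zero so disjoint boxes contribute no intersection area.
    inter = max(0.0, ymax - ymin) * max(0.0, xmax - xmin)
    area1 = (box1[2] - box1[0]) * (box1[3] - box1[1])
    area2 = (box2[2] - box2[0]) * (box2[3] - box2[1])
    return inter / (area1 + area2 - inter)

print(iou([0, 0, 2, 2], [0, 0, 2, 2]))  # 1.0 (identical boxes)
print(iou([0, 0, 1, 1], [1, 1, 2, 2]))  # 0.0 (disjoint boxes)
```

A perfect prediction scores 1.0 and a miss scores 0.0, which is what lets IOU stand in for a confidence target.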
## Authors
...
...
@@ -56,9 +56,9 @@ the entire output.
 ## Our Goal
-Our goal with this model conversion is to provide implementations of the
-Backbone and Yolo Head. We have built the model in such a way that the Yolo
-head could be connected to a new, more powerful backbone if a person chose to.
+Our goal with this model conversion is to provide implementation of the
+Backbone and YOLO Head. We have built the model in such a way that the YOLO
+head could be connected to a new, more powerful backbone if a person chose to.
 ## Models in the library
...
official/vision/beta/projects/yolo/configs/darknet_classification.py
...
@@ -35,7 +35,7 @@ class ImageClassificationModel(hyperparams.Config):
       type='darknet', darknet=backbones.Darknet())
   dropout_rate: float = 0.0
   norm_activation: common.NormActivation = common.NormActivation()
-  # Adds a BatchNormalization layer pre-GlobalAveragePooling in classification
+  # Adds a BatchNormalization layer pre-GlobalAveragePooling in classification.
   add_head_batch_norm: bool = False
...
official/vision/beta/projects/yolo/modeling/backbones/darknet.py
...
@@ -16,7 +16,7 @@
 """Contains definitions of Darknet Backbone Networks.
-The models are inspired by ResNet, and CSPNet
+These models are inspired by ResNet and CSPNet.
 Residual networks (ResNets) were proposed in:
 [1] Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun
...
@@ -49,7 +49,7 @@ from official.vision.beta.projects.yolo.modeling.layers import nn_blocks
 class BlockConfig:
-  """Class to store layer config to make code more readable"""
+  """This is a class to store layer config to make code more readable."""

   def __init__(self, layer, stack, reps, bottleneck, filters, pool_size,
...
@@ -69,7 +69,7 @@ class BlockConfig:
     padding: An `int` for the padding to apply to layers in this stack.
     activation: A `str` for the activation to use for this stack.
     route: An `int` for the level to route from to get the next input.
-    dilation_rate: An `int` for the scale used in dialated Darknet.
+    dilation_rate: An `int` for the scale used in dilated Darknet.
     output_name: A `str` for the name to use for this output.
     is_output: A `bool` for whether this layer is an output in the default
       model.
...
@@ -99,9 +99,10 @@ def build_block_specs(config):
 class LayerBuilder:
-  """class for quick look up of default layers used by darknet to
-  connect, introduce or exit a level. Used in place of an if condition
-  or switch to make adding new layers easier and to reduce redundant code
+  """This is a class that is used for quick look up of default layers used
+  by darknet to connect, introduce or exit a level. Used in place of an
+  if condition or switch to make adding new layers easier and to reduce
+  redundant code.
   """

   def __init__(self):
...
@@ -377,7 +378,7 @@ BACKBONES = {
 @tf.keras.utils.register_keras_serializable(package='yolo')
 class Darknet(tf.keras.Model):
-  """ The Darknet backbone architecture """
+  """ The Darknet backbone architecture. """

   def __init__(self,
...
official/vision/beta/projects/yolo/modeling/backbones/darknet_test.py
...
@@ -13,7 +13,7 @@
 # limitations under the License.
 # Lint as: python3
-"""Tests for yolo."""
+"""Tests for YOLO."""
 from absl.testing import parameterized
 import numpy as np
...
official/vision/beta/projects/yolo/modeling/decoders/yolo_decoder.py
...
@@ -13,7 +13,7 @@
 # limitations under the License.
 # Lint as: python3
-"""Feature Pyramid Network and Path Aggregation variants used in YOLO"""
+"""Feature Pyramid Network and Path Aggregation variants used in YOLO."""
 import tensorflow as tf
 from official.vision.beta.projects.yolo.modeling.layers import nn_blocks
...
@@ -23,8 +23,10 @@ from official.vision.beta.projects.yolo.modeling.layers import nn_blocks
 class _IdentityRoute(tf.keras.layers.Layer):

   def __init__(self, **kwargs):
-    """Private class to mirror the outputs of blocks in nn_blocks for an easier
-    programatic generation of the feature pyramid network"""
+    """
+    Private class to mirror the outputs of blocks in nn_blocks for an easier
+    programatic generation of the feature pyramid network.
+    """
     super().__init__(**kwargs)
...
@@ -125,7 +127,7 @@ class YoloFPN(tf.keras.layers.Layer):
     # directly connect to an input path and process it
     self.preprocessors = dict()
     # resample an input and merge it with the output of another path
-    # inorder to aggregate backbone outputs
+    # in order to aggregate backbone outputs
     self.resamples = dict()
     # set of convoltion layers and upsample layers that are used to
     # prepare the FPN processors for output
...
@@ -214,7 +216,7 @@ class YoloPAN(tf.keras.layers.Layer):
       kernel_initializer: kernel_initializer for convolutional layers.
       kernel_regularizer: tf.keras.regularizers.Regularizer object for Conv2D.
       bias_regularizer: tf.keras.regularizers.Regularizer object for Conv2d.
-      fpn_input: `bool`, for whether the input into this fucntion is an FPN or
+      fpn_input: `bool`, for whether the input into this function is an FPN or
         a backbone.
       fpn_filter_scale: `int`, scaling factor for the FPN filters.
       **kwargs: keyword arguments to be passed.
...
@@ -268,7 +270,7 @@ class YoloPAN(tf.keras.layers.Layer):
     # directly connect to an input path and process it
     self.preprocessors = dict()
     # resample an input and merge it with the output of another path
-    # inorder to aggregate backbone outputs
+    # in order to aggregate backbone outputs
     self.resamples = dict()
     # FPN will reverse the key process order for the backbone, so we need
...
official/vision/beta/projects/yolo/modeling/heads/yolo_head_test.py
...
@@ -13,7 +13,7 @@
 # limitations under the License.
 # Lint as: python3
-"""Tests for yolo heads."""
+"""Tests for YOLO heads."""
 # Import libraries
 from absl.testing import parameterized
...
@@ -44,7 +44,6 @@ class YoloDecoderTest(parameterized.TestCase, tf.test.TestCase):
       inputs[key] = tf.ones(input_shape[key], dtype=tf.float32)
     endpoints = head(inputs)
-    # print(endpoints)
     for key in endpoints.keys():
       expected_input_shape = input_shape[key]
...
official/vision/beta/projects/yolo/modeling/layers/nn_blocks.py
...
@@ -14,7 +14,7 @@
 # Lint as: python3
-"""Contains common building blocks for yolo neural networks."""
+"""Contains common building blocks for YOLO neural networks."""
 from typing import Callable, List
 import tensorflow as tf
 from official.modeling import tf_utils
...
@@ -35,9 +35,9 @@ class Identity(tf.keras.layers.Layer):
 class ConvBN(tf.keras.layers.Layer):
   """
   Modified Convolution layer to match that of the Darknet Library.
-  The Layer is a standards combination of Conv BatchNorm Activation,
-  however, the use of bias in the conv is determined by the use of batch
-  normalization.
+  The Layer is a standard combination of Conv BatchNorm Activation,
+  however, the use of bias in the Conv is determined by the use of batch
+  normalization.
   Cross Stage Partial networks (CSPNets) were proposed in:
   [1] Chien-Yao Wang, Hong-Yuan Mark Liao, I-Hau Yeh, Yueh-Hua Wu,
     Ping-Yang Chen, Jun-Wei Hsieh
...
@@ -71,16 +71,16 @@ class ConvBN(tf.keras.layers.Layer):
       use.
     padding: string 'valid' or 'same', if same, then pad the image, else do
       not.
-    dialtion_rate: tuple to indicate how much to modulate kernel weights and
+    dilation_rate: tuple to indicate how much to modulate kernel weights and
       how many pixels in a feature map to skip.
     kernel_initializer: string to indicate which function to use to initialize
       weights.
     bias_initializer: string to indicate which function to use to initialize
       bias.
-    kernel_regularizer: string to indicate which function to use to
-      regularizer weights.
     bias_regularizer: string to indicate which function to use to regularizer
       bias.
+    kernel_regularizer: string to indicate which function to use to
+      regularizer weights.
     use_bn: boolean for whether to use batch normalization.
     use_sync_bn: boolean for whether sync batch normalization statistics
       of all batch norm layers to the models global statistics
...
@@ -191,7 +191,7 @@ class ConvBN(tf.keras.layers.Layer):
 @tf.keras.utils.register_keras_serializable(package='yolo')
 class DarkResidual(tf.keras.layers.Layer):
-  """Darknet block with Residual connection for Yolo v3 Backbone"""
+  """Darknet block with Residual connection for YOLO v3 Backbone"""

   def __init__(self,
...
@@ -228,8 +228,6 @@ class DarkResidual(tf.keras.layers.Layer):
       (across all input batches).
     norm_momentum: float for moment to use for batch normalization.
     norm_epsilon: float for batch normalization epsilon.
-    conv_activation: string or None for activation function to use in layer,
-      if None activation is replaced by linear.
     leaky_alpha: float to use as alpha if activation function is leaky.
     sc_activation: string for activation function to use in layer.
     downsample: boolean for if image input is larger than layer output, set
...
@@ -352,10 +350,10 @@ class DarkResidual(tf.keras.layers.Layer):
 @tf.keras.utils.register_keras_serializable(package='yolo')
 class CSPTiny(tf.keras.layers.Layer):
-  """A Small size convolution block proposed in the CSPNet. The layer uses
-  shortcuts, routing(concatnation), and feature grouping in order to improve
-  gradient variablity and allow for high efficency, low power residual learning
-  for small networtf.keras.
+  """A small size convolution block proposed in the CSPNet. The layer uses
+  shortcuts, routing(concatenation), and feature grouping in order to improve
+  gradient variability and allow for high efficiency, low power residual
+  learning for small networks.
   Cross Stage Partial networks (CSPNets) were proposed in:
   [1] Chien-Yao Wang, Hong-Yuan Mark Liao, I-Hau Yeh, Yueh-Hua Wu,
     Ping-Yang Chen, Jun-Wei Hsieh
...
@@ -387,11 +385,11 @@ class CSPTiny(tf.keras.layers.Layer):
       weights.
     bias_initializer: string to indicate which function to use to initialize
       bias.
-    use_bn: boolean for whether to use batch normalization.
-    kernel_regularizer: string to indicate which function to use to
-      regularizer weights.
     bias_regularizer: string to indicate which function to use to regularizer
       bias.
+    kernel_regularizer: string to indicate which function to use to
+      regularizer weights.
+    use_bn: boolean for whether to use batch normalization.
     use_sync_bn: boolean for whether sync batch normalization statistics
       of all batch norm layers to the models global statistics
       (across all input batches).
...
@@ -401,12 +399,12 @@ class CSPTiny(tf.keras.layers.Layer):
       feature stack output.
     norm_momentum: float for moment to use for batch normalization.
     norm_epsilon: float for batch normalization epsilon.
-    conv_activation: string or None for activation function to use in layer,
-      if None activation is replaced by linear.
-    leaky_alpha: float to use as alpha if activation function is leaky.
-    sc_activation: string for activation function to use in layer.
     downsample: boolean for if image input is larger than layer output, set
       downsample to True so the dimensions are forced to match.
+    leaky_alpha: float to use as alpha if activation function is leaky.
+    sc_activation: string for activation function to use in layer.
+    conv_activation: string or None for activation function to use in layer,
+      if None activation is replaced by linear.
     **kwargs: Keyword Arguments.
    """
...
@@ -505,18 +503,18 @@
 @tf.keras.utils.register_keras_serializable(package='yolo')
 class CSPRoute(tf.keras.layers.Layer):
   """
-  Down sampling layer to take the place of down sampleing done in Residual
+  Down sampling layer to take the place of down sampling done in Residual
   networks. This is the first of 2 layers needed to convert any Residual Network
   model to a CSPNet. At the start of a new level change, this CSPRoute layer
-  creates a learned identity that will act as a cross stage connection, that
-  is used to inform the inputs to the next stage. It is called cross stage
-  partial because the number of filters required in every intermitent Residual
+  creates a learned identity that will act as a cross stage connection that
+  is used to inform the inputs to the next stage. This is called cross stage
+  partial because the number of filters required in every intermittent residual
   layer is reduced by half. The sister layer will take the partial generated by
-  this layer and concatnate it with the output of the final residual layer in
-  the stack to create a fully feature level output. This concatnation merges the
+  this layer and concatenate it with the output of the final residual layer in
+  the stack to create a fully feature level output. This concatenation merges the
   partial blocks of 2 levels as input to the next allowing the gradients of each
   level to be more unique, and reducing the number of parameters required by
   each level by 50% while keeping accuracy consistent.
   Cross Stage Partial networks (CSPNets) were proposed in:
   [1] Chien-Yao Wang, Hong-Yuan Mark Liao, I-Hau Yeh, Yueh-Hua Wu,
...
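The corrected docstring above describes the cross stage partial split: half the channels bypass the residual stack and are concatenated back in by the sister layer. A minimal numpy sketch of that routing arithmetic (an illustration only, not the repository's `CSPRoute`/`CSPConnect` Keras layers, which also contain learned convolutions):

```python
import numpy as np

def csp_route(x):
    """Split the channel dimension in half: one partial becomes the cross
    stage connection, the other half is fed to the residual stack."""
    half = x.shape[-1] // 2
    return x[..., :half], x[..., half:]

def csp_connect(partial, processed):
    """Sister op: concatenate the untouched partial with the processed half
    to rebuild a full feature level output."""
    return np.concatenate([partial, processed], axis=-1)

x = np.ones((1, 8, 8, 64), dtype=np.float32)
partial, stack_in = csp_route(x)
# Each intermediate layer now only sees 32 channels instead of 64,
# which is where the ~50% parameter reduction comes from.
out = csp_connect(partial, stack_in * 2.0)  # stand-in for the residual stack
print(out.shape)  # (1, 8, 8, 64)
```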
@@ -544,24 +542,24 @@ class CSPRoute(tf.keras.layers.Layer):
     """
     Args:
       filters: integer for output depth, or the number of features to learn
-      filter_scale: integer dicating (filters//2) or the number of filters in
+      filter_scale: integer dictating (filters//2) or the number of filters in
         the partial feature stack.
-      downsample: down_sample the input.
       activation: string for activation function to use in layer.
       kernel_initializer: string to indicate which function to use to
         initialize weights.
       bias_initializer: string to indicate which function to use to initialize
         bias.
-      kernel_regularizer: string to indicate which function to use to
-        regularizer weights.
       bias_regularizer: string to indicate which function to use to regularizer
         bias.
+      kernel_regularizer: string to indicate which function to use to
+        regularizer weights.
       use_bn: boolean for whether to use batch normalization.
       use_sync_bn: boolean for whether sync batch normalization statistics
         of all batch norm layers to the models global statistics
         (across all input batches).
       norm_momentum: float for moment to use for batch normalization.
       norm_epsilon: float for batch normalization epsilon.
+      downsample: down_sample the input.
       **kwargs: Keyword Arguments.
     """
...
@@ -571,7 +569,7 @@ class CSPRoute(tf.keras.layers.Layer):
     self._filter_scale = filter_scale
     self._activation = activation
-    # convoultion params
+    # convolution params
     self._kernel_initializer = kernel_initializer
     self._bias_initializer = bias_initializer
     self._kernel_regularizer = kernel_regularizer
...
@@ -638,7 +636,7 @@ class CSPRoute(tf.keras.layers.Layer):
 class CSPConnect(tf.keras.layers.Layer):
   """
   Sister Layer to the CSPRoute layer. Merges the partial feature stacks
-  generated by the CSPDownsampling layer, and the finaly output of the
+  generated by the CSPDownsampling layer, and the final output of the
   residual stack. Suggested in the CSPNet paper.
   Cross Stage Partial networks (CSPNets) were proposed in:
   [1] Chien-Yao Wang, Hong-Yuan Mark Liao, I-Hau Yeh, Yueh-Hua Wu,
...
@@ -675,10 +673,10 @@ class CSPConnect(tf.keras.layers.Layer):
       weights.
     bias_initializer: string to indicate which function to use to initialize
       bias.
-    kernel_regularizer: string to indicate which function to use to
-      regularizer weights.
     bias_regularizer: string to indicate which function to use to regularizer
       bias.
+    kernel_regularizer: string to indicate which function to use to
+      regularizer weights.
     use_bn: boolean for whether to use batch normalization.
     use_sync_bn: boolean for whether sync batch normalization statistics
       of all batch norm layers to the models global
...
@@ -750,13 +748,13 @@ class CSPConnect(tf.keras.layers.Layer):
 class CSPStack(tf.keras.layers.Layer):
   """
-  CSP full stack, combines the route and the connect in case you dont want to
-  jsut quickly wrap an existing callable or list of layers to make it a cross
-  stage partial. Added for ease of use. you should be able to wrap any layer
-  stack with a CSP independent of wether it belongs to the Darknet family. if
-  filter_scale = 2, then the blocks in the stack passed into the the CSP stack
-  should also have filters = filters/filter_scale
+  CSP full stack, combines the route and the connect in case you don't want to
+  just quickly wrap an existing callable or list of layers to make it a cross
+  stage partial. Added for ease of use. you should be able to wrap any layer
+  stack with a CSP independent of whether it belongs to the Darknet family. If
+  filter_scale = 2, then the blocks in the stack passed into the the CSP stack
+  should also have filters = filters/filter_scale
   Cross Stage Partial networks (CSPNets) were proposed in:
   [1] Chien-Yao Wang, Hong-Yuan Mark Liao, I-Hau Yeh, Yueh-Hua Wu,
     Ping-Yang Chen, Jun-Wei Hsieh
...
@@ -781,11 +779,10 @@ class CSPStack(tf.keras.layers.Layer):
              **kwargs):
     """
     Args:
-      filters: integer for output depth, or the number of features to learn.
       model_to_wrap: callable Model or a list of callable objects that will
         process the output of CSPRoute, and be input into CSPConnect.
         list will be called sequentially.
-      downsample: down_sample the input.
+      filters: integer for output depth, or the number of features to learn.
       filter_scale: integer dicating (filters//2) or the number of filters in
         the partial feature stack.
       activation: string for activation function to use in layer.
...
@@ -793,10 +790,11 @@ class CSPStack(tf.keras.layers.Layer):
       weights.
       bias_initializer: string to indicate which function to use to initialize
         bias.
-      kernel_regularizer: string to indicate which function to use to
-        regularizer weights.
       bias_regularizer: string to indicate which function to use to regularizer
         bias.
+      kernel_regularizer: string to indicate which function to use to
+        regularizer weights.
+      downsample: down_sample the input.
       use_bn: boolean for whether to use batch normalization.
       use_sync_bn: boolean for whether sync batch normalization statistics
         of all batch norm layers to the models global statistics
...
@@ -891,10 +889,10 @@ class PathAggregationBlock(tf.keras.layers.Layer):
       weights.
     bias_initializer: string to indicate which function to use to initialize
       bias.
-    kernel_regularizer: string to indicate which function to use to
-      regularizer weights.
     bias_regularizer: string to indicate which function to use to regularizer
       bias.
+    kernel_regularizer: string to indicate which function to use to
+      regularizer weights.
     use_bn: boolean for whether to use batch normalization.
     use_sync_bn: boolean for whether sync batch normalization statistics
       of all batch norm layers to the models global statistics
...
@@ -905,8 +903,8 @@ class PathAggregationBlock(tf.keras.layers.Layer):
     activation: string or None for activation function to use in layer,
       if None activation is replaced by linear.
     leaky_alpha: float to use as alpha if activation function is leaky.
-    downsample: `bool` for whehter to downwample and merge.
-    upsample: `bool` for whehter to upsample and merge.
+    downsample: `bool` for whether to downsample and merge.
+    upsample: `bool` for whether to upsample and merge.
     upsample_size: `int` how much to upsample in order to match shapes.
     **kwargs: Keyword Arguments.
    """
...
@@ -1050,7 +1048,7 @@ class PathAggregationBlock(tf.keras.layers.Layer):
 @tf.keras.utils.register_keras_serializable(package='yolo')
 class SPP(tf.keras.layers.Layer):
-  """a non-agregated SPP layer that uses Pooling to gain more performance"""
+  """A non-aggregated SPP layer that uses Pooling to gain more performance."""

   def __init__(self, sizes, **kwargs):
...
@@ -1090,7 +1088,7 @@ class SAM(tf.keras.layers.Layer):
   [1] Sanghyun Woo, Jongchan Park, Joon-Young Lee, In So Kweon
     CBAM: Convolutional Block Attention Module. arXiv:1807.06521
-  implementation of the Spatial Attention Model (SAM)
+  Implementation of the Spatial Attention Model (SAM)
   """

   def __init__(self,
...
@@ -1167,7 +1165,7 @@ class CAM(tf.keras.layers.Layer):
   [1] Sanghyun Woo, Jongchan Park, Joon-Young Lee, In So Kweon
     CBAM: Convolutional Block Attention Module. arXiv:1807.06521
-  implementation of the Channel Attention Model (CAM)
+  Implementation of the Channel Attention Model (CAM)
   """

   def __init__(self,
...
@@ -1253,7 +1251,7 @@ class CBAM(tf.keras.layers.Layer):
   [1] Sanghyun Woo, Jongchan Park, Joon-Young Lee, In So Kweon
     CBAM: Convolutional Block Attention Module. arXiv:1807.06521
-  implementation of the Convolution Block Attention Module (CBAM)
+  Implementation of the Convolution Block Attention Module (CBAM)
   """

   def __init__(self,
...
@@ -1321,8 +1319,9 @@ class CBAM(tf.keras.layers.Layer):
 @tf.keras.utils.register_keras_serializable(package='yolo')
 class DarkRouteProcess(tf.keras.layers.Layer):
   """
-  process darknet outputs and connect back bone to head more generalizably
-  Abstracts repetition of DarkConv objects that is common in YOLO.
+  Processes darknet outputs and connects the backbone to the head for more
+  generalizability and abstracts the repetition of DarkConv objects that is
+  common in YOLO.
   It is used like the following:
...
@@ -1357,18 +1356,18 @@ class DarkRouteProcess(tf.keras.layers.Layer):
       filters: the number of filters to be used in all subsequent layers
         filters should be the depth of the tensor input into this layer,
         as no downsampling can be done within this layer object.
-      repetitions: number of times to repeat the processign nodes
-        for tiny: 1 repition, no spp allowed
+      repetitions: number of times to repeat the processing nodes
+        for tiny: 1 repetition, no spp allowed
         for spp: insert_spp = True, and allow for 3+ repetitions
         for regular: insert_spp = False, and allow for 3+ repetitions.
       insert_spp: bool if true add the spatial pyramid pooling layer.
-      kernel_initializer: method to use to initializa kernel weights.
+      kernel_initializer: method to use to initialize kernel weights.
       bias_initializer: method to use to initialize the bias of the conv
         layers.
       norm_momentum: batch norm parameter see TensorFlow documentation.
       norm_epsilon: batch norm parameter see TensorFlow documentation.
       activation: activation function to use in processing.
-      leaky_alpha: if leaky acitivation function, the alpha to use in
+      leaky_alpha: if leaky activation function, the alpha to use in
         processing the relu input.
     Returns:
...
official/vision/beta/projects/yolo/ops/box_ops.py
...
@@ -173,7 +173,7 @@ def compute_giou(box1, box2, yxyx=False, darknet=False):
       boxes.
     yxyx: a `bool` indicating whether the input box is of the format x_center
       y_center, width, height or y_min, x_min, y_max, x_max.
-    darknet: a `bool` indicating whether the calling function is the yolo
+    darknet: a `bool` indicating whether the calling function is the YOLO
       darknet loss.
   Returns:
...
@@ -216,7 +216,7 @@ def compute_diou(box1, box2, beta=1.0, yxyx=False, darknet=False):
       regularization term.
     yxyx: a `bool` indicating whether the input box is of the format x_center
       y_center, width, height or y_min, x_min, y_max, x_max.
-    darknet: a `bool` indicating whether the calling function is the yolo
+    darknet: a `bool` indicating whether the calling function is the YOLO
       darknet loss.
   Returns:
...
@@ -262,7 +262,7 @@ def compute_ciou(box1, box2, yxyx=False, darknet=False):
       boxes.
     yxyx: a `bool` indicating whether the input box is of the format x_center
       y_center, width, height or y_min, x_min, y_max, x_max.
-    darknet: a `bool` indicating whether the calling function is the yolo
+    darknet: a `bool` indicating whether the calling function is the YOLO
       darknet loss.
   Returns:
...
@@ -311,7 +311,6 @@ def aggregated_comparitive_iou(boxes1,
     beta: `float` for the scaling quantity to apply to distance iou
       regularization.
   Returns:
     iou: a `Tensor` who represents the intersection over union in of the
       expected/input type.
...
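The `yxyx` flag documented above switches between the two box formats named in the docstrings. A small numpy sketch of that conversion (an illustration only, not the repository's `box_ops` code):

```python
import numpy as np

def xcycwh_to_yxyx(boxes):
    """Convert [x_center, y_center, width, height] boxes to
    [y_min, x_min, y_max, x_max]."""
    x, y, w, h = np.moveaxis(boxes, -1, 0)
    # Half the width/height on each side of the center gives the corners.
    return np.stack([y - h / 2, x - w / 2, y + h / 2, x + w / 2], axis=-1)

boxes = np.array([[0.5, 0.5, 1.0, 1.0]])
print(xcycwh_to_yxyx(boxes))  # [[0. 0. 1. 1.]]
```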
official/vision/beta/projects/yolo/ops/math_ops.py
-"""A set of private math operations used to safely implement the yolo loss"""
+"""A set of private math operations used to safely implement the YOLO loss."""
 import tensorflow as tf
...
@@ -19,7 +19,7 @@ def rm_nan_inf(x, val=0.0):
 def rm_nan(x, val=0.0):
-  """remove nan and infinity.
+  """Remove nan and infinity.
   Args:
     x: any `Tensor` of any type.
...
@@ -50,9 +50,9 @@ def divide_no_nan(a, b):
 def mul_no_nan(x, y):
-  """Nan safe multiply operation built to allow model compilation in tflite and
-  to allowing one tensor to mask another. Where ever x is zero the
+  """Nan safe multiply operation built to allow model compilation in tflite and
+  to allow one tensor to mask another. Where ever x is zero the
   multiplication is not computed and the value is replaced with a zero. This is
-  requred because 0 * nan = nan. This can make computation unstable in some
+  required because 0 * nan = nan. This can make computation unstable in some
   cases where the intended behavior is for zero to mean ignore.
   Args:
...
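The `mul_no_nan` docstring above describes masking where `0 * nan` must yield 0 rather than nan. A numpy sketch of the same idea (a stand-in for the TF op described, not the repository's implementation):

```python
import numpy as np

def mul_no_nan(x, y):
    """Where x is zero, select zero directly so 0 * nan stays 0 instead of
    propagating nan into the masked positions."""
    return np.where(x == 0, np.zeros_like(y), x * y)

x = np.array([0.0, 1.0, 2.0])
y = np.array([np.nan, 3.0, 4.0])
print(mul_no_nan(x, y))  # [0. 3. 8.]
```

Note that `x * y` is still evaluated eagerly here (so the nan is produced and then discarded by `np.where`); the TF version matters precisely because gradients through a plain multiply would still see the nan.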
official/vision/beta/projects/yolo/ops/nms_ops.py
...
@@ -8,13 +8,12 @@ class TiledNMS():
   IOU_TYPES = {'diou': 0, 'giou': 1, 'ciou': 2, 'iou': 3}

   def __init__(self, iou_type='diou', beta=0.6):
-    '''initialization for all non max supression operations mainly used to
-    select hyperperamters for the iou type and scaling.
+    '''initialization for all non max suppression operations mainly used to
+    select hyperparameters for the iou type and scaling.
     Args:
       iou_type: `str` for the version of IOU to use {diou, giou, ciou, iou}.
-      beta: `float` for the amount to scale regualrization on distance iou.
+      beta: `float` for the amount to scale regularization on distance iou.
     '''
     self._iou_type = TiledNMS.IOU_TYPES[iou_type]
     self._beta = beta
...
@@ -54,7 +53,7 @@ class TiledNMS():
         overlap too much with respect to IOU.
       output_size: an int32 tensor of size [batch_size]. Representing the number
         of selected boxes for each batch.
-      idx: an integer scalar representing induction variable.
+      idx: an integer scalar representing an induction variable.
     Returns:
       boxes: updated boxes.
...
@@ -114,7 +113,7 @@ class TiledNMS():
       * Boxes with higher scores can be used to suppress boxes with lower
         scores.
-    The overal design of the algorithm is to handle boxes tile-by-tile:
+    The overall design of the algorithm is to handle boxes tile-by-tile:
     boxes = boxes.pad_to_multiply_of(tile_size)
     num_tiles = len(boxes) // tile_size
...
@@ -126,7 +125,7 @@ class TiledNMS():
       iou = bbox_overlap(box_tile, suppressing_tile)
       # if the box is suppressed in iou, clear it to a dot
       box_tile *= _update_boxes(iou)
-      # Iteratively handle the diagnal tile.
+      # Iteratively handle the diagonal tile.
       iou = _box_overlap(box_tile, box_tile)
       iou_changed = True
       while iou_changed:
...
@@ -232,7 +231,7 @@ class TiledNMS():
     This implementation unrolls classes dimension while using the tf.while_loop
     to implement the batched NMS, so that it can be parallelized at the batch
-    dimension. It should give better performance comparing to v1 implementation.
+    dimension. It should give better performance compared to v1 implementation.
     It is TPU compatible.
     Args:
...
@@ -376,8 +375,8 @@ def segment_nms(boxes, classes, confidence, k, iou_thresh):
       confidence: a `Tensor` of shape [batch size, N] that needs to be
         filtered.
       k: a `integer` for the maximum number of boxes to keep after filtering
-      iou_thresh: a `float` for the value above which boxes are consdered to be
-        too similar, the closer to 1.0 the less that gets though.
+      iou_thresh: a `float` for the value above which boxes are considered to be
+        too similar, the closer to 1.0 the less that gets through.
     Return:
       boxes: filtered `Tensor` of shape [batch size, k, 4]
...
@@ -428,8 +427,8 @@ def nms(boxes,
       confidence: a `Tensor` of shape [batch size, N] that needs to be
         filtered.
       k: a `integer` for the maximum number of boxes to keep after filtering
-      nms_thresh: a `float` for the value above which boxes are consdered to be
-        too similar, the closer to 1.0 the less that gets though.
+      nms_thresh: a `float` for the value above which boxes are considered to be
+        too similar, the closer to 1.0 the less that gets through.
       pre_nms_top_k: an int number of top candidate detections per class
         before NMS.
...
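The tiled and segmented variants above are TPU-friendly reworkings of the same suppression rule the docstrings describe: drop boxes whose IOU with a higher-scoring box exceeds the threshold. As a plain-Python reference for that rule (a greedy sketch only, not the repository's `TiledNMS` or `segment_nms`):

```python
import numpy as np

def greedy_nms(boxes, scores, iou_thresh):
    """Keep the highest-scoring box, drop remaining boxes whose IOU with it
    exceeds iou_thresh, and repeat on what is left.
    boxes are [y_min, x_min, y_max, x_max]."""
    order = np.argsort(-scores)
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        # IOU of the current best box against the remaining candidates.
        yx1 = np.maximum(boxes[i, :2], boxes[order[1:], :2])
        yx2 = np.minimum(boxes[i, 2:], boxes[order[1:], 2:])
        inter = np.prod(np.clip(yx2 - yx1, 0, None), axis=-1)
        area_i = np.prod(boxes[i, 2:] - boxes[i, :2])
        areas = np.prod(boxes[order[1:], 2:] - boxes[order[1:], :2], axis=-1)
        iou = inter / (area_i + areas - inter)
        order = order[1:][iou <= iou_thresh]
    return keep

boxes = np.array([[0, 0, 2, 2], [0, 0, 2.1, 2.1], [5, 5, 6, 6]], dtype=float)
scores = np.array([0.9, 0.8, 0.7])
print(greedy_nms(boxes, scores, 0.5))  # [0, 2]
```

The second box nearly duplicates the first and is suppressed; as the docstrings note, the closer the threshold is to 1.0, the less gets through.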