Unverified commit 658c84c8 authored by Hongkun Yu, committed by GitHub

Remove differential_privacy and morph_net from research folder (#8121)

* Remove differential_privacy and morph_net from research folder because they have been migrated to google-research/ for a while

* Update README.md

* Update CODEOWNERS
parent 1b3b2839
@@ -16,10 +16,8 @@
 /research/deep_contextual_bandits/ @rikel
 /research/deeplab/ @aquariusjay @yknzhu @gpapan
 /research/delf/ @andrefaraujo
-/research/differential_privacy/ @ilyamironov @ananthr
 /research/domain_adaptation/ @bousmalis @dmrd
 /research/efficient-hrl/ @ofirnachum
-/research/gan/ @joel-shor
 /research/global_objectives/ @mackeya-google
 /research/im2txt/ @cshallue
 /research/inception/ @shlens @vincentvanhoucke
@@ -34,7 +32,6 @@
 /research/lstm_object_detection/ @dreamdragon @masonliuw @yinxiaoli @yongzhe2160
 /research/marco/ @vincentvanhoucke
 /research/maskgan/ @a-dai
-/research/morph_net/ @gariel-google
 /research/namignizer/ @knathanieltucker
 /research/neural_gpu/ @lukaszkaiser
 /research/neural_programmer/ @arvind2505
...
@@ -28,8 +28,6 @@ request.
 - [deep_speech](deep_speech): automatic speech recognition.
 - [deeplab](deeplab): deep labeling for semantic image segmentation.
 - [delf](delf): deep local features for image matching and retrieval.
-- [differential_privacy](differential_privacy): differential privacy for training
-  data.
 - [domain_adaptation](domain_adaptation): domain separation networks.
 - [fivo](fivo): filtering variational objectives for training generative
   sequence models.
...
# Deep Learning with Differential Privacy
All of the content from this directory has moved to the [tensorflow/privacy](https://github.com/tensorflow/privacy) repository, which is dedicated to learning with (differential) privacy.
# MorphNet project has moved to https://github.com/google-research/morph-net.
**This directory contains the deprecated version of MorphNet. Please use the [new repo](https://github.com/google-research/morph-net).**
# Regularizers Framework
[TOC]
## Goal
The goal of this framework is to facilitate building sparsifying regularizers
for deep networks. A regularizer targets a certain cost (***targeted
cost***), such as the FLOP cost of inference, model size, latency, memory
footprint, etc.
In order to form such a regularizer, we traverse the TensorFlow graph and find
the ops that contribute to the *targeted cost*. For each op, we apply a
sparsifying regularizer that induces sparsity *among the activations*. The
sparsifying regularizer of each activation is weighted by its marginal
contribution to the *targeted cost*.
Calculating this weight may be a nontrivial task. For example, for a fully
connected layer the FLOP cost is proportional to the number of inputs times
the number of outputs, which means that the marginal cost of each output
is proportional to the number of inputs. Some of the inputs may have been
already regularized away, which means that the calculation of one op's FLOP
regularizer depends on the regularization of the output of other ops. Moreover,
if an op receives as its input a concatenation or a sum of several other ops,
figuring out the regularizer requires some bookkeeping.
The goal of this framework is to take care of this bookkeeping in a general way,
to facilitate building a wide variety of regularizers, targeting a wide variety
of *targeted costs*, with little effort and fewer opportunities to err. In
what follows we outline the framework, building it from the bottom up: from a
single activation all the way to a full, complex network.
## `OpRegularizers` and how they are assigned
### The `OpRegularizer` interface
`OpRegularizer` is the most primitive element in the framework. An
`OpRegularizer` refers to a TensorFlow op, and has two methods,
`regularization_vector` and `alive_vector`, both of which return `tf.Tensor`s of rank 1
(vectors). `regularization_vector` is of type float, and its `i`-th entry is the
regularizer of the `i`-th activation of the op the `OpRegularizer` refers to.
In order to regularize away that activation, one would need to add the `i`-th
entry of `regularization_vector`, multiplied by some coefficient, to the
training loss. The stronger we want to penalize it, the larger the coefficient
is. Assuming that the regularizer is of sparsifying nature (e.g. L1 norm), with
a large enough coefficient, the `i`-th activation will eventually vanish.
Loosely speaking, if we were to target the total number of activations in the
network, we would add the sum of all `regularization_vector`s from all
`OpRegularizer`s to the training loss.
Since `OpRegularizer` is an abstract interface, with no awareness of the nature
of regularization used, the decision when an activation can be considered alive
is also deferred to `OpRegularizer`, via the `alive_vector` method. The `i`-th
entry evaluates to a boolean that indicates whether the activation is alive.
```python
class OpRegularizer(object):
@abc.abstractproperty
def regularization_vector(self):
"""Returns a vector of floats with a regularizer for each activation."""
pass
@abc.abstractproperty
def alive_vector(self):
"""Returns a bool vector indicating which activations are alive."""
pass
```
As an example, we can consider a fully connected layer that has `m` inputs and
`n` outputs. The layer is represented by an `m * n` matrix, and one way to
impose a sparsifying regularizer on the `i`-th output is by grouping all weights
associated with it into a group LASSO regularizer, such as the L2 norm of the
`i`-th row of the matrix. That would therefore be the `i`-th entry of the
`regularization_vector`.
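To make this concrete, here is a minimal sketch (not the library's own implementation) of such a group-LASSO regularizer for a fully connected layer, following the `OpRegularizer` interface above; the weight variable and threshold are hypothetical stand-ins.
```python
import tensorflow as tf

class GroupLassoFullyConnectedRegularizer(object):
  """Sketch of a group-LASSO OpRegularizer for an [m, n] weight matrix."""

  def __init__(self, weights, threshold=0.1):
    # One L2 norm per output activation, over all weights feeding that output.
    self._regularization_vector = tf.norm(weights, axis=0)
    self._alive_vector = self._regularization_vector > threshold

  @property
  def regularization_vector(self):
    return self._regularization_vector

  @property
  def alive_vector(self):
    return self._alive_vector

# Hypothetical usage for a layer with 4 inputs and 3 outputs:
fc_weights = tf.get_variable('fc_weights', shape=[4, 3])
fc_regularizer = GroupLassoFullyConnectedRegularizer(fc_weights)
```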
When such a regularization is added to the training loss, the L2 norms of the
rows of the matrix tend to form a bimodal distribution with one peak near "zero"
(up to numerical noise), another peak away from zero, and a void in between. A
natural way to determine whether the `i`-th activation is alive is thus by
comparing the `i`-th entry of the `regularization_vector` to some threshold that
lies in that void: If it's above the threshold, it's alive.
![HistogramOfActivationSTDs](../g3doc/histogram.png "Typical bimodal distribution of
the standard deviations of the activations of a convolutional layer when a
sparsifying regularizer is applied.")
There are ops that are not regularized, such as constants, or the input to the
network. For an un-regularized op, the `OpRegularizer` is set to `None`, which
implies an all-zero `regularization_vector` and an all-True `alive_vector`.
### Rules for assigning `OpRegularizer`s to ops
As we traverse the TensorFlow graph, we assign an `OpRegularizer` to each op we
encounter according to the set of rules outlined in this section. We first
explain "default rules", rules that address propagating `OpRegularizers` across
connections in the TensorFlow graph. Then we discuss client-specified rules,
which can augment and override the default rules.
#### Pass-through ops
Many TensorFlow ops inherit the `OpRegularizer` of their input. These are ops
that:
* Don't change the alive status of activations.
* The only way an activation can be eliminated from their output is if
it's eliminated from their input.
An example is adding a bias to the output of a convolution. After adding a bias
to it, an activation will be alive (that is, have nonzero variance) if and only
if it was alive before adding the bias. If we want to regularize away an activation
at the output of a `BiasAdd` op, the only way to do so is to penalize the same
activation in the preceding convolution.
Since both the `regularization_vector` and the `alive_vector` of such an op are
identical to those of its input, so is the entire `OpRegularizer`. We refer to
such ops as *pass-through* ops. Shape-preserving unary ops (e.g. ReLU) are
generally *pass-through*, but some binary ops are too. In our framework ops are
assumed to be *pass-through* by default. Exceptions to this rule are discussed
below.
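As a small illustration (a sketch assuming a `manager` of the kind introduced below, and a simple graph where `conv1/Conv2D` is followed by `conv1/BiasAdd` and `conv1/Relu`, with no grouping involved), pass-through ops simply reuse the regularizer of their input:
```python
import tensorflow as tf

# Hypothetical op names; `manager` is an OpRegularizerManager (see below).
graph = tf.get_default_graph()
conv_reg = manager.get_regularizer(graph.get_operation_by_name('conv1/Conv2D'))
relu_reg = manager.get_regularizer(graph.get_operation_by_name('conv1/Relu'))
# BiasAdd and Relu are pass-through, so they share the convolution's regularizer.
assert relu_reg is conv_reg
```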
#### Grouping
When learning the number of outputs of ops in a TensorFlow graph, some ops are
constrained to maintain the same number of outputs as others. Elementwise
ops that are performed on two (or more) tensors, such as addition,
multiplication, or maximum, constrain their input tensors to have the same size.
Common use cases are attention maps, recurrent models, and residual connections.
An example of a residual connection is illustrated in the diagram below. It
would be problematic if the activations of op1 and op2 didn't live or die
together. For example, if the `i`-th activation of op1 is alive but for op2 it's
dead, we still cannot eliminate the `i`-th activation from op2 without breaking
the topology of the network.
![ResidualConnections](../g3doc/grouping.png "Ops with residual connections"
)
In our framework we choose to impose preservation of the topology. That is, ops
that are connected with addition (or other elementwise binary ops) are
constrained to have their activations live and die together. The `i`-th
activations of each of those ops are grouped together in a single LASSO group.
The default grouping mechanism is maximum for the `regularization_vector` and
elementwise logical OR for the `alive_vector`. To regularize away the `i`-th
element of the group one needs to penalize the maximum of the `i`-th regularization
terms of all ops comprising the group, and to declare the entire `i`-th group
dead, the `i`-th element in all ops comprising the group must be dead. However,
the framework admits other forms of grouping, and user-defined grouping methods
can easily be plugged into it.
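A small numeric sketch of the default grouping, with hypothetical vectors for two ops tied by a residual connection:
```python
import tensorflow as tf

reg_op1 = tf.constant([0.9, 0.0, 0.4])
reg_op2 = tf.constant([0.1, 0.0, 0.7])
alive_op1 = tf.constant([True, False, True])
alive_op2 = tf.constant([False, False, True])

# Default grouping: elementwise maximum for the regularization vector,
# elementwise logical OR for the alive vector.
group_reg = tf.maximum(reg_op1, reg_op2)           # [0.9, 0.0, 0.7]
group_alive = tf.logical_or(alive_op1, alive_op2)  # [True, False, True]
```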
One property of the grouping, which may seem confusing initially, is that once
two (or more) `OpRegularizer`s are grouped, and the `OpRegularizer` of the
group is formed, the `OpRegularizer`s comprising the group are all 'replaced' by
the `OpRegularizer` of the group. For example, in the diagram above, the
`OpRegularizer`s of op1 and op2 have to be grouped. Therefore if the `i`-th
output of op1 is alive and that of op2 is dead, and we use the default grouping
described above, the `i`-th output of the group is *alive*.
Now, consider op4, which receives only op2 as input. From the point of view of
op4, the `i`-th activation of op2 must be considered *alive*, even though the
original op2 regularizer deemed it *dead*. This is because we already know that
we won't be able to do away with the `i`-th activation of op2 - it is tied to
that of op1, which is alive. Therefore, after the grouping, the
`OpRegularizer`s of all constituents of the group are henceforth *replaced* by
the `OpRegularizer` of the group.
#### Concatenation
Often outputs of several ops are concatenated to a single tensor. For example,
in Inception networks, the outputs of various convolutional 'towers' are
concatenated along the channels dimension. In such a case, it is obvious that
the `regularization_vector` (`alive_vector`) of the concatenation is a
concatenation of the `regularization_vector` (`alive_vector`) of the
concatenated ops.
Similarly to the logic of grouping, once the concatenation of the
`OpRegularizer`s has happened, the concatenated `OpRegularizer`s cease to exist
and are replaced by slices of their concatenation. For example, if op1 has 3
outputs and op2 has 4, and op3 is their concatenation, op3 has 7 outputs. After
the concatenation, the `alive_vector` of op1 will be a slice (from index 0 to
index 2) of the `alive_vector` of op3, whereas for op2 it will be another slice
(from index 3 to index 6).
If op3 is later grouped with op4, as happens in Inception ResNet architectures,
a group will be formed, and the `alive_vector` of op1 will henceforth be a slice
(from index 0 to index 2) of the `alive_vector` of *the new group*. This is for
the same reasons as the ones described in the section above.
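The following sketch shows this bookkeeping with hypothetical alive vectors for op1 (3 outputs) and op2 (4 outputs):
```python
import tensorflow as tf

alive_op1 = tf.constant([True, False, True])         # 3 outputs of op1
alive_op2 = tf.constant([False, True, True, False])  # 4 outputs of op2

# op3 concatenates op1 and op2, so its alive vector is the concatenation.
alive_op3 = tf.concat([alive_op1, alive_op2], 0)     # 7 entries

# After the concatenation, op1 and op2 are represented by slices of op3's
# vector (or of whatever group op3 is later merged into).
alive_op1_slice = tf.slice(alive_op3, [0], [3])      # indices 0..2
alive_op2_slice = tf.slice(alive_op3, [3], [4])      # indices 3..6
```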
#### Client-specified rules
The client code of the framework has the opportunity to specify rules for
creating `OpRegularizers`. For example, for ops of type `MatMul`, which are the
common implementation of fully-connected layers, the client can choose to assign
group LASSO regularizers similar to the one described above. Typically the
client code would choose to do that for 'interesting' ops, like convolutions and
fully-connected layers, but the choice of rules is ultimately deferred to the
client code.
The client code may also choose to override the *default rules*. Ops are
considered *pass-through* by default, and obviously there are cases where this
is not true, such as reshaping, slicing, sparse matrix operations, etc.
TensorFlow is much too expressive for us to be able to anticipate every usage
pattern of its ops and to properly regularize them. The set of default rules
covers most of the common published convolutional networks, but we do not presume
to cover *all* networks. More complex networks may require adding some custom
rules.
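For illustration, a client-specified rule set might look like the sketch below; the factory function is hypothetical and reuses the group-LASSO sketch from above, while the dictionary structure matches the `op_regularizer_factory_dict` argument described in the next section.
```python
import tensorflow as tf

def group_lasso_factory(op, opreg_manager):
  """Hypothetical factory: group-LASSO regularizer for Conv2D/MatMul ops."""
  del opreg_manager  # Not needed by this simple factory.
  weights = op.inputs[1]
  # Collapse all dimensions except the output depth, then reuse the
  # fully connected group-LASSO sketch from above.
  flat_weights = tf.reshape(weights, [-1, weights.shape.as_list()[-1]])
  return GroupLassoFullyConnectedRegularizer(flat_weights)

op_regularizer_factory_dict = {
    'Conv2D': group_lasso_factory,
    'MatMul': group_lasso_factory,
}
```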
### OpRegularizerManager
`OpRegularizerManager` is the class responsible for assigning an `OpRegularizer`
to each op in the TensorFlow graph. Its constructor crawls the TensorFlow graph,
starting from the ops listed in the `ops` argument (typically the output of the
network), recursively, and assigns `OpRegularizer`s to each op encountered. Once
the object is constructed, it provides read-only methods that allow querying the
`OpRegularizer` for any op that was encountered during construction, and a list
of the latter ops for convenience.
```python
class OpRegularizerManager(object):
"""Assigns OpRegularizers to ops in a graph and bookkeeps the mapping."""
def __init__(self, ops, op_regularizer_factory_dict,
create_grouping_regularizer=None):
"""Creates an instance.
Args:
ops: A list of tf.Operation. An OpRegularizer will be created for all the
ops in `ops`, and recursively for all ops they depend on via data
dependency. Typically `ops` would contain a single tf.Operation, which
is the output of the network.
op_regularizer_factory_dict: A dictionary, where the keys are strings
representing TensorFlow Op types, and the values are callables that
create the respective OpRegularizers. For every op encountered during
the recursion, if op.type is in op_regularizer_factory_dict, the
respective callable will be used to create an OpRegularizer. The
signature of the callables is the following:
op: a tf.Operation for which to create a regularizer.
opreg_manager: A reference to an OpRegularizerManager object. Can be
None if the callable does not need access to OpRegularizerManager.
create_grouping_regularizer: A callable that has the signature of
grouping_regularizers.MaxGroupingRegularizer's constructor. Will be
called whenever a grouping op (see _GROUPING_OPS) is encountered.
Defaults to MaxGroupingRegularizer if None.
Raises:
ValueError: If ops is not a list.
"""
...
def get_regularizer(self, op):
"""Returns the OpRegularizer object pertaining to `op`.
Args:
op: a tf.Operation object.
Returns:
An OpRegularizer object, or None if `op` does not have one.
Raises:
ValueError: The OpRegularizerManager object did not encounter `op` when
it was constructed and the graph was traversed, and thus does not know
the answer.
"""
...
@property
def ops(self):
"""Returns all tf.Operations for which `get_regularizer` is known."""
...
```
As the constructor crawls the graph, it invokes the following set of rules, for
any op encountered:
* If `op_regularizer_factory_dict` has a rule on how to create an
`OpRegularizer` for the type of the op encountered, invoke the rule. These
are the user-specified rules. Otherwise:
* If the op has no inputs, return `None`. Examples are constants and variables.
Otherwise:
* If the op is a concatenation, invoke the rule for concatenation described above.
Otherwise:
* If the op has more than one regularized input (that is, an input that has a
non-`None` `OpRegularizer`), perform grouping. Being conservative, we first check
that the op is whitelisted for being a grouping op (elementwise addition,
subtraction, etc.). Otherwise:
* The op is a *pass-through*. That is, its OpRegularizer is the same as of its
input.
The implementation is recursive: we start from the output node(s) of the graph.
To build an `OpRegularizer` for each op, we need to know the `OpRegularizer` of
its inputs, so we make a recursive call to find out those, and so on.
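A minimal usage sketch, assuming `logits` is the network's output tensor and `op_regularizer_factory_dict` is a rule dictionary like the one sketched earlier:
```python
import tensorflow as tf
from morph_net.framework import op_regularizer_manager

manager = op_regularizer_manager.OpRegularizerManager(
    [logits.op], op_regularizer_factory_dict)

# Query the regularizer of any op that was reached during construction.
conv_op = tf.get_default_graph().get_operation_by_name('conv1/Conv2D')
conv_regularizer = manager.get_regularizer(conv_op)
```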
<!-- TODO: Explain how to change the grouping mechanism. -->
## Network Regularizers
A `NetworkRegularizer` object targets a certain *targeted cost* of an entire
network. Its interface is:
```python
class NetworkRegularizer(object):
"""An interface for Network Regularizers."""
@abc.abstractmethod
def get_regularization_term(self, ops=None):
"""Compute the FluidNet regularization term.
Args:
ops: A list of tf.Operation. If specified, only the regularization term
associated with the ops in `ops` will be returned. Otherwise, all
relevant ops in the default TensorFlow graph will be included.
Returns:
A tf.Tensor scalar of floating point type that evaluates to the
regularization term.
"""
pass
@abc.abstractmethod
def get_cost(self, ops=None):
"""Calculates the cost targeted by the Regularizer.
Args:
ops: A list of tf.Operation. If specified, only the cost pertaining to the
ops in the `ops` will be returned. Otherwise, all relevant ops in the
default TensorFlow graph will be included.
Returns:
A tf.Tensor scalar that evaluates to the cost.
"""
pass
```
The TensorFlow scalar returned by `get_cost` evaluates to the *targeted
cost*, and is typically used for monitoring (e.g. displaying it in
TensorBoard). The scalar returned by `get_regularization_term` is the one that
has to be added to the training loss, multiplied by a coefficient controlling
its strength.
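In a training script this typically looks like the following sketch; `network_regularizer` and `cross_entropy_loss` are assumed to already exist, and the coefficient is an arbitrary example value.
```python
import tensorflow as tf

regularization_strength = 1e-9  # Hypothetical value, tuned per model.
total_loss = cross_entropy_loss + (
    regularization_strength * network_regularizer.get_regularization_term())

# The targeted cost itself is only monitored, e.g. in TensorBoard.
tf.summary.scalar('TargetedCost', network_regularizer.get_cost())
```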
`OpRegularizerManager` and the `OpRegularizer`s it provides for ops in the graph
are intended to facilitate easy implementation of `NetworkRegularizer`s. We
exemplify it here in the context of targeting FLOPs for a convolutional network,
but the same principles apply for other *targeted costs*.
Most of the consumption of FLOPs in convolutional networks happens in the
convolutions. As a first approximation, we can neglect the FLOP impact of the
other ops in the graph, even though the framework readily allows including the
FLOP contribution of all ops, even the ones that have negligible cost.
Within this approximation, in order to build the FLOP `NetworkRegularizer`, its
constructor needs to:
* Crawl the graph, starting from the output of the network, and find all
convolution ops on which the output depends.
* For each of these convolution ops, create an `OpRegularizer`.
* Find the `OpRegularizer` of the *input* of each convolution op.
* Implement Eq. (6) in the [MorphNet paper](https://arxiv.org/abs/1711.06798) to
calculate the total FLOP cost of all convolutions, and an equation similar to
Eq. (9) to calculate the respective regularization term. We say 'similar'
because Eq. (9) refers to a specific type of regularization, where the
`regularization_vector` of a convolution is the absolute value of the respective
batch-norm gamma vector. However the exact nature of the `regularization_vector`
is delegated to the `OpRegularizer`.
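For a single convolution, the resulting cost and regularization term are bilinear in the input and output regularizers, as in the sketch below (hypothetical vectors; `coeff` plays the role of 2 * output_height * output_width * filter_size):
```python
import tensorflow as tf

coeff = 2.0 * 14 * 14 * 9  # e.g. a 3x3 convolution with a 14x14 output map.

# Hypothetical stand-ins for the OpRegularizers of the convolution's input
# and output.
input_alive = tf.constant([True, True, False])
input_reg = tf.constant([0.8, 0.5, 0.0])
output_alive = tf.constant([True, False, True, True])
output_reg = tf.constant([0.6, 0.0, 0.9, 0.2])

num_alive_in = tf.reduce_sum(tf.cast(input_alive, tf.float32))
num_alive_out = tf.reduce_sum(tf.cast(output_alive, tf.float32))

# Eq. (6)-style cost: FLOPs as a function of alive inputs and outputs.
flop_cost = coeff * num_alive_in * num_alive_out
# Eq. (9)-style regularization term: each side's regularization vector is
# weighted by the number of alive activations on the other side.
flop_reg_term = coeff * (num_alive_in * tf.reduce_sum(output_reg) +
                         num_alive_out * tf.reduce_sum(input_reg))
```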
# Copyright 2018 The TensorFlow Authors All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""OpRegularizers that concatenate and slice other OpRegularizers.
When we have a concatenation op in the network, which concatenates several
tensors, the regularizers of the concatenated ops (that is, the
regularization_vector-s and the alive_vector-s) should be concatenated as well.
Slicing is the complementary op - if regularizers Ra and Rb were concatenated
into a regularizer Rc, Ra and Rb can be obtained from Rc by slicing.
"""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import tensorflow as tf
from morph_net.framework import generic_regularizers
class ConcatRegularizer(generic_regularizers.OpRegularizer):
"""An OpRegularizer that concatenates others, to reflect a Concat op."""
def __init__(self, regularizers_to_concatenate):
for r in regularizers_to_concatenate:
if not generic_regularizers.dimensions_are_compatible(r):
raise ValueError('Bad regularizer: dimensions are not compatible')
self._alive_vector = tf.concat(
[r.alive_vector for r in regularizers_to_concatenate], 0)
self._regularization_vector = tf.concat(
[r.regularization_vector for r in regularizers_to_concatenate], 0)
@property
def regularization_vector(self):
return self._regularization_vector
@property
def alive_vector(self):
return self._alive_vector
class SlicingReferenceRegularizer(generic_regularizers.OpRegularizer):
"""An OpRegularizer that slices a segment of another regularizer.
This is useful to complement the ConcatRegularizer. For example, suppose that
we have two ops, one with 3 outputs (Op1) and the other with 4 outputs (Op2).
Each has its own regularizer, Reg1 and Reg2.
Now suppose that a concat op concatenated Op1 and Op2 into OpC. Reg1 and Reg2
should be concatenated to RegC. To make the situation more complicated, RegC
was grouped in a group lasso with another op in the graph, resulting in RegG.
What happens next? All references to RegC should obviously be replaced by
RegG. But what about Reg1? The latter could be the first 3 outputs of RegG,
and Reg2 would be the last 4 outputs of RegG.
SlicingReferenceRegularizer is a regularizer that picks a segment of outputs
from an existing OpRegularizer. When OpRegularizers are concatenated, they
are replaced by SlicingReferenceRegularizer-s.
"""
def __init__(self, get_regularizer_to_slice, begin, size):
"""Creates an instance.
Args:
get_regularizer_to_slice: A callable, such that get_regularizer_to_slice()
returns an OpRegularizer that has to be sliced.
begin: An integer, where to begin the slice.
size: An integer, the length of the slice (so the slice ends at
begin + size).
"""
self._get_regularizer_to_slice = get_regularizer_to_slice
self._begin = begin
self._size = size
self._alive_vector = None
self._regularization_vector = None
@property
def regularization_vector(self):
if self._regularization_vector is None:
regularizer_to_slice = self._get_regularizer_to_slice()
self._regularization_vector = tf.slice(
regularizer_to_slice.regularization_vector, [self._begin],
[self._size])
return self._regularization_vector
@property
def alive_vector(self):
if self._alive_vector is None:
regularizer_to_slice = self._get_regularizer_to_slice()
assert regularizer_to_slice is not self
self._alive_vector = tf.slice(regularizer_to_slice.alive_vector,
[self._begin], [self._size])
return self._alive_vector
# Copyright 2018 The TensorFlow Authors All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Tests for framework.concat_and_slice_regularizers."""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import tensorflow as tf
from morph_net.framework import concat_and_slice_regularizers
from morph_net.testing import op_regularizer_stub
class ConcatAndSliceRegularizersTest(tf.test.TestCase):
def setUp(self):
self._reg_vec1 = [0.1, 0.3, 0.6, 0.2]
self._alive_vec1 = [False, True, True, False]
self._reg_vec2 = [0.2, 0.4, 0.5]
self._alive_vec2 = [False, True, False]
self._reg1 = op_regularizer_stub.OpRegularizerStub(self._reg_vec1,
self._alive_vec1)
self._reg2 = op_regularizer_stub.OpRegularizerStub(self._reg_vec2,
self._alive_vec2)
def testConcatRegularizer(self):
concat_reg = concat_and_slice_regularizers.ConcatRegularizer(
[self._reg1, self._reg2])
with self.test_session():
self.assertAllEqual(self._alive_vec1 + self._alive_vec2,
concat_reg.alive_vector.eval())
self.assertAllClose(self._reg_vec1 + self._reg_vec2,
concat_reg.regularization_vector.eval(), 1e-5)
def testSliceRegularizer(self):
concat_reg = concat_and_slice_regularizers.SlicingReferenceRegularizer(
lambda: self._reg1, 1, 2)
with self.test_session():
self.assertAllEqual(self._alive_vec1[1:3],
concat_reg.alive_vector.eval())
self.assertAllClose(self._reg_vec1[1:3],
concat_reg.regularization_vector.eval(), 1e-5)
if __name__ == '__main__':
tf.test.main()
# Copyright 2018 The TensorFlow Authors All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Interface for MorphNet regularizers framework.
A subclass of Regularizer represents a regularizer that targets a certain
quantity: Number of flops, model size, number of activations etc. The
Regularizer interface has two methods:
1. `get_regularization_term`, which returns a regularization term that should be
included in the total loss to target the quantity.
2. `get_cost`, the quantity itself (for example, the number of flops). This is
useful for display in TensorBoard, and later, to provide feedback for
automatically tuning the coefficient that multiplies the regularization term,
until the cost reaches (or goes below) its target value.
"""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import abc
class OpRegularizer(object):
"""An interface for Op Regularizers.
An OpRegularizer object corresponds to a tf.Operation, and provides
a regularizer for the output of the op (we assume that the op has one output
of interest in the context of MorphNet).
"""
__metaclass__ = abc.ABCMeta
@abc.abstractproperty
def regularization_vector(self):
"""Returns a vector of floats, with regularizers.
The length of the vector is the number of "output activations" (call them
neurons, nodes, filters etc) of the op. For a convolutional network, it's
the number of filters (aka "depth"). For a fully-connected layer, it's
usually the second (and last) dimension - assuming the first one is the
batch size.
"""
pass
@abc.abstractproperty
def alive_vector(self):
"""Returns a vector of booleans, indicating which activations are alive.
(call them activations, neurons, nodes, filters etc). This vector is of the
same length as the regularization_vector.
"""
pass
class NetworkRegularizer(object):
"""An interface for Network Regularizers."""
__metaclass__ = abc.ABCMeta
@abc.abstractmethod
def get_regularization_term(self, ops=None):
"""Compute the regularization term.
Args:
ops: A list of tf.Operation objects. If specified, only the regularization
term associated with the ops in `ops` will be returned. Otherwise, all
relevant ops in the default TensorFlow graph will be included.
Returns:
A tf.Tensor scalar of floating point type that evaluates to the
regularization term (that should be added to the total loss, with a
suitable coefficient)
"""
pass
@abc.abstractmethod
def get_cost(self, ops=None):
"""Calculates the cost targeted by the Regularizer.
Args:
ops: A list of tf.Operation objects. If specified, only the cost
pertaining to the ops in the `ops` will be returned. Otherwise, all
relevant ops in the default TensorFlow graph will be included.
Returns:
A tf.Tensor scalar that evaluates to the cost.
"""
pass
def dimensions_are_compatible(op_regularizer):
"""Checks if op_regularizer's alive_vector matches regularization_vector."""
return op_regularizer.alive_vector.shape.with_rank(1).dims[
0].is_compatible_with(
op_regularizer.regularization_vector.shape.with_rank(1).dims[0])
# Copyright 2018 The TensorFlow Authors All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Regularizers that group other regularizers for residual connections.
An Elementwise operation between two tensors (addition, multiplication, maximum
etc) imposes a constraint of equality of the shapes of the constituents. For
example, if A, B are convolutions, and another op in the network
receives A + B as input, it means that the i-th output of A is tied to the i-th
output of B. Only if the i-th output was regularized away by the regularizer in
both A and B can we discard the i-th activation in both.
Therefore we group the i-th output of A and the i-th output of B in a group
LASSO, a group for each i. The grouping methods can vary, and this file offers
several variants.
Residual connections, in ResNet or in RNNs, are examples where this kind of
grouping is needed.
"""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import tensorflow as tf
from morph_net.framework import generic_regularizers
DEFAULT_THRESHOLD = 0.01
class MaxGroupingRegularizer(generic_regularizers.OpRegularizer):
"""A regularizer that groups others by taking their maximum."""
def __init__(self, regularizers_to_group):
"""Creates an instance.
Args:
regularizers_to_group: A list of generic_regularizers.OpRegularizer
objects. Their regularization_vectors (alive_vectors) are expected to be of
the same length.
Raises:
ValueError: regularizers_to_group is not of length 2. (TODO:
support arbitrary length if needed.)
"""
_raise_if_length_is_not2(regularizers_to_group)
self._regularization_vector = tf.maximum(
regularizers_to_group[0].regularization_vector,
regularizers_to_group[1].regularization_vector)
self._alive_vector = tf.logical_or(regularizers_to_group[0].alive_vector,
regularizers_to_group[1].alive_vector)
@property
def regularization_vector(self):
return self._regularization_vector
@property
def alive_vector(self):
return self._alive_vector
class L2GroupingRegularizer(generic_regularizers.OpRegularizer):
r"""A regularizer that groups others by taking their L2 norm.
R_j = sqrt((\sum_i r_{ij}^2))
Where r_i is the i-th regularization vector, r_{ij} is its j-th element, and
R_j is the j-th element of the resulting regularization vector.
"""
def __init__(self, regularizers_to_group, threshold=DEFAULT_THRESHOLD):
"""Creates an instance.
Args:
regularizers_to_group: A list of generic_regularizers.OpRegularizer
objects. Their regularization_vectors (alive_vectors) are expected to be of
the same length.
threshold: A float. A group of activations will be considered alive if
its L2 norm is greater than `threshold`.
Raises:
ValueError: regularizers_to_group is not of length 2. (TODO:
support arbitrary length if needed.)
"""
_raise_if_length_is_not2(regularizers_to_group)
self._regularization_vector = tf.sqrt((
lazy_square(regularizers_to_group[0].regularization_vector) +
lazy_square(regularizers_to_group[1].regularization_vector)))
self._alive_vector = self._regularization_vector > threshold
@property
def regularization_vector(self):
return self._regularization_vector
@property
def alive_vector(self):
return self._alive_vector
def _raise_if_length_is_not2(regularizers_to_group):
if len(regularizers_to_group) != 2:
raise ValueError('Currently only groups of size 2 are supported.')
def lazy_square(tensor):
"""Computes the square of a tensor in a lazy way.
This function is lazy in the following sense: if the tensor was computed as
tensor = tf.sqrt(input)
it will return input directly (rather than computing tf.square(tensor)).
Args:
tensor: A `Tensor` of floats to compute the square of.
Returns:
The square of the input tensor.
"""
if tensor.op.type == 'Sqrt':
return tensor.op.inputs[0]
else:
return tf.square(tensor)
# Copyright 2018 The TensorFlow Authors All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Tests for framework.grouping_regularizers."""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
from absl.testing import parameterized
import numpy as np
import tensorflow as tf
from morph_net.framework import grouping_regularizers
from morph_net.testing import op_regularizer_stub
def _l2_reg_with_025_threshold(regularizers_to_group):
return grouping_regularizers.L2GroupingRegularizer(regularizers_to_group,
0.25)
class GroupingRegularizersTest(parameterized.TestCase, tf.test.TestCase):
# TODO: Add parametrized tests.
def setUp(self):
self._reg_vec1 = [0.1, 0.3, 0.6, 0.2]
self._alive_vec1 = [False, True, True, False]
self._reg_vec2 = [0.2, 0.4, 0.5, 0.1]
self._alive_vec2 = [False, True, False, True]
self._reg_vec3 = [0.3, 0.2, 0.0, 0.25]
self._alive_vec3 = [False, True, False, True]
self._reg1 = op_regularizer_stub.OpRegularizerStub(self._reg_vec1,
self._alive_vec1)
self._reg2 = op_regularizer_stub.OpRegularizerStub(self._reg_vec2,
self._alive_vec2)
self._reg3 = op_regularizer_stub.OpRegularizerStub(self._reg_vec3,
self._alive_vec3)
def testMaxGroupingRegularizer(self):
group_reg = grouping_regularizers.MaxGroupingRegularizer(
[self._reg1, self._reg2])
with self.test_session():
self.assertAllEqual(
[x or y for x, y in zip(self._alive_vec1, self._alive_vec2)],
group_reg.alive_vector.eval())
self.assertAllClose(
[max(x, y) for x, y in zip(self._reg_vec1, self._reg_vec2)],
group_reg.regularization_vector.eval(), 1e-5)
def testL2GroupingRegularizer(self):
group_reg = grouping_regularizers.L2GroupingRegularizer(
[self._reg1, self._reg2], 0.25)
expcted_reg_vec = [
np.sqrt((x**2 + y**2))
for x, y in zip(self._reg_vec1, self._reg_vec2)
]
with self.test_session():
self.assertAllEqual([x > 0.25 for x in expcted_reg_vec],
group_reg.alive_vector.eval())
self.assertAllClose(expcted_reg_vec,
group_reg.regularization_vector.eval(), 1e-5)
@parameterized.named_parameters(
('Max', grouping_regularizers.MaxGroupingRegularizer),
('L2', _l2_reg_with_025_threshold))
def testOrderDoesNotMatter(self, create_reg):
group12 = create_reg([self._reg1, self._reg2])
group13 = create_reg([self._reg1, self._reg3])
group23 = create_reg([self._reg2, self._reg3])
group123 = create_reg([group12, self._reg3])
group132 = create_reg([group13, self._reg2])
group231 = create_reg([group23, self._reg1])
with self.test_session():
self.assertAllEqual(group123.alive_vector.eval(),
group132.alive_vector.eval())
self.assertAllEqual(group123.alive_vector.eval(),
group231.alive_vector.eval())
self.assertAllClose(group123.regularization_vector.eval(),
group132.regularization_vector.eval())
self.assertAllClose(group123.regularization_vector.eval(),
group231.regularization_vector.eval())
if __name__ == '__main__':
tf.test.main()
# Copyright 2018 The TensorFlow Authors All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""A class for managing OpRegularizers.
OpRegularizerManager creates the required regularizers and manages the
association between ops and their regularizers. OpRegularizerManager handles the
logic associated with the graph topology:
- Concatenating tensors is reflected in concatenating their regularizers.
- Skip-connections (aka residual connections), RNNs and other structures where
the shapes of two (or more) tensors are tied together are reflected in
grouping their regularizers together.
"""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import collections
import logging
import tensorflow as tf
from morph_net.framework import concat_and_slice_regularizers
from morph_net.framework import generic_regularizers
from morph_net.framework import grouping_regularizers
# When an op has two (or more) inputs that have regularizers, the latter need to
# be grouped. _GROUPING_OPS is a whitelist of ops that are allowed to group, as
# a form of verification of the correctness of the code. The list is not
# exhaustive, feel free to add other grouping ops as needed.
_GROUPING_OPS = ('Add', 'Sub', 'Mul', 'Div', 'Maximum', 'Minimum',
'SquaredDifference', 'RealDiv') # TODO: Is Div needed?
# Ops that are not pass-through, that necessarily modify the regularizer.
# These are the Ops that should not have a regularizer that is identical to
# that of one of their inputs. When we recursively look for regularizers along the graph
# the recursion will always stop at these Ops even if no regularizer factory
# is provided, and never assume that they pass the regularizer of their input
# through.
NON_PASS_THROUGH_OPS = ('Conv2D', 'Conv2DBackpropInput', 'MatMul')
def _remove_nones_and_dups(items):
result = []
for i in items:
if i is not None and i not in result:
result.append(i)
return result
def _raise_type_error_if_not_operation(op):
if not isinstance(op, tf.Operation):
raise TypeError('\'op\' must be of type tf.Operation, not %s' %
str(type(op)))
class OpRegularizerManager(object):
"""A class for managing OpRegularizers."""
# Public methods -------------------------------------------------------------
def __init__(self, ops, op_regularizer_factory_dict,
create_grouping_regularizer=None):
"""Creates an instance.
Args:
ops: A list of tf.Operation-s. An OpRegularizer will be created for all
the ops in `ops`, and recursively for all ops they depend on via data
dependency. Typically `ops` would contain a single tf.Operation, which
is the output of the network.
op_regularizer_factory_dict: A dictionary, where the keys are strings
representing TensorFlow Op types, and the values are callables that
create the respective OpRegularizers. For every op encountered during
the recursion, if op.type is in op_regularizer_factory_dict, the
respective callable will be used to create an OpRegularizer. The
signature of the callables is the following:
op: a tf.Operation for which to create a regularizer.
opreg_manager: A reference to an OpRegularizerManager object. Can be
None if the callable does not need access to OpRegularizerManager.
create_grouping_regularizer: A callable that has the signature of
grouping_regularizers.MaxGroupingRegularizer's constructor. Will be
called whenever a grouping op (see _GROUPING_OPS) is encountered.
Defaults to MaxGroupingRegularizer if None.
Raises:
ValueError: If ops is not a list.
"""
self._constructed = False
if not isinstance(ops, list):
raise ValueError(
'Input %s ops is not a list. Should probably use []' % str(ops))
self._op_to_regularizer = {}
self._regularizer_to_ops = collections.defaultdict(list)
self._op_regularizer_factory_dict = op_regularizer_factory_dict
for op_type in NON_PASS_THROUGH_OPS:
if op_type not in self._op_regularizer_factory_dict:
self._op_regularizer_factory_dict[op_type] = lambda x, y: None
self._create_grouping_regularizer = (
create_grouping_regularizer or
grouping_regularizers.MaxGroupingRegularizer)
self._visited = set()
for op in ops:
self._get_regularizer(op)
self._constructed = True
def get_regularizer(self, op):
"""Looks up or creates an OpRegularizer for a tf.Operation.
Args:
op: A tf.Operation.
- If `self` has an OpRegularizer for `op`, it will be returned.
Otherwise:
- If called before construction of `self` was completed (that is, from the
constructor), an attempt to create an OpRegularizer for `op` will be made.
Otherwise:
- If called after construction of `self` was completed, an exception will
be raised.
Returns:
An OpRegularizer for `op`. Can be None if `op` is not regularized (e.g.
`op` is a constant).
Raises:
ValueError: If the `self` object has no OpRegularizer for `op` in its
lookup table, and the construction of `self` has already been completed
(because then `self` is immutable and an OpRegularizer cannot be
created).
"""
try:
return self._op_to_regularizer[op]
except KeyError:
if self._constructed:
raise ValueError('Op %s does not have a regularizer.' % op.name)
else:
return self._get_regularizer(op)
@property
def ops(self):
return self._op_to_regularizer.keys()
# ---- Public MUTABLE methods ------------------------------------------------
#
# These methods are intended to be called by OpRegularizer factory functions,
# in the constructor of OpRegularizerManager. OpRegularizerManager is
# immutable after construction, so calling these methods after construction
# has been completed will raise an exception.
def group_and_replace_regularizers(self, regularizers):
"""Groups a list of OpRegularizers and replaces them by the grouped one.
Args:
regularizers: A list of OpRegularizer objects to be grouped.
Returns:
An OpRegularizer object formed by the grouping.
Raises:
RuntimeError: group_and_replace_regularizers was called after
construction of the OpRegularizerManager object was completed.
"""
if self._constructed:
raise RuntimeError('group_and_replace_regularizers can only be called '
'before construction of the OpRegularizerManager was '
'completed.')
grouped = self._create_grouping_regularizer(regularizers)
# Replace all the references to the regularizers by the new grouped
# regularizer.
for r in regularizers:
self._replace_regularizer(r, grouped)
return grouped
# Private methods ------------------------------------------------------------
def _get_regularizer(self, op):
"""Fetches the regularizer of `op` if exists, creates it otherwise.
This function calls itself recursively, directly or via _create_regularizer
(which in turn calls _get_regularizer). It performs DFS along the data
dependencies of the graph, and uses a self._visited set to detect loops. The
use of self._visited makes it not thread safe, but _get_regularizer is a
private method that is supposed to only be called from the constructor, so
execution in multiple threads (for the same object) is not expected.
Args:
op: A Tf.Operation.
Returns:
An OpRegularizer that corresponds to `op`, or None if op does not have
a regularizer (e.g. it's a constant op).
"""
_raise_type_error_if_not_operation(op)
if op not in self._op_to_regularizer:
if op in self._visited:
# In while loops, the data dependencies form a loop.
# TODO: RNNs have "legit" loops - will this still work?
return None
self._visited.add(op)
regularizer = self._create_regularizer(op)
self._op_to_regularizer[op] = regularizer
self._regularizer_to_ops[regularizer].append(op)
# Make sure that there is a regularizer (or None) for every op on which
# `op` depends via data dependency.
for i in op.inputs:
self._get_regularizer(i.op)
self._visited.remove(op)
return self._op_to_regularizer[op]
def _create_regularizer(self, op):
"""Creates an OpRegularizer for `op`.
Args:
op: A Tf.Operation.
Returns:
An OpRegularizer that corresponds to `op`, or None if op does not have
a regularizer.
Raises:
RuntimeError: Grouping is attempted at op which is not whitelisted for
grouping (in _GROUPING_OPS).
"""
# First we see if there is a factory function for creating the regularizer
# in the op_regularizer_factory_dict (supplied in the constructor).
if op.type in self._op_regularizer_factory_dict:
regularizer = self._op_regularizer_factory_dict[op.type](op, self)
if regularizer is None:
logging.warning('Failed to create regularizer for %s.', op.name)
else:
logging.info('Created regularizer for %s.', op.name)
return regularizer
# Unless overridden in op_regularizer_factory_dict, we assume that ops
# without inputs have no regularizers. These are 'leaf' ops, typically
# constants and variables.
if not op.inputs:
return None
if op.type == 'ConcatV2':
return self._create_concat_regularizer(op)
inputs_regularizers = _remove_nones_and_dups(
[self._get_regularizer(i.op) for i in op.inputs])
# Ops whose inputs have no regularizers, and that are not in
# op_regularizer_factory_dict, have no regularizer either (think of ops that
# only involve constants as an example).
if not inputs_regularizers:
return None
# Ops that have one input with a regularizer, and are not in
# op_regularizer_factory_dict, are assumed to be pass-through, that is, to
# carry over the regularizer of their inputs. Examples:
# - Unary ops, such as RELU.
# - BiasAdd, or similar ops, that involve a constant/variable and a
# regularized op (e.g. the convolution that comes before the bias).
elif len(inputs_regularizers) == 1:
return inputs_regularizers[0]
# Group if we have more than one regularizer in the inputs of `op` and if it
# is white-listed for grouping.
elif op.type in _GROUPING_OPS:
return self.group_and_replace_regularizers(inputs_regularizers)
raise RuntimeError('Grouping is attempted at op which is not whitelisted '
'for grouping: %s' % str(op.type))
def _create_concat_regularizer(self, concat_op):
"""Creates an OpRegularizer for a concat op.
Args:
concat_op: A tf.Operation of type ConcatV2.
Returns:
An OpRegularizer for `concat_op`.
"""
# We omit the last input, because it's the concat dimension. Others are
# the tensors to be concatenated.
input_ops = [i.op for i in concat_op.inputs[:-1]]
regularizers_to_concat = [self._get_regularizer(op) for op in input_ops]
# If all inputs have no regularizer, so does the concat op.
if regularizers_to_concat == [None] * len(regularizers_to_concat):
return None
offset = 0
# Replace the regularizers_to_concat by SlicingReferenceRegularizer-s that
# slice the concatenated regularizer.
ops_to_concat = []
for r, op in zip(regularizers_to_concat, input_ops):
if r is None:
length = op.outputs[0].shape.as_list()[-1]
offset += length
ops_to_concat.append(self._ConstantOpReg(length))
else:
length = tf.shape(r.alive_vector)[0]
slice_ref = concat_and_slice_regularizers.SlicingReferenceRegularizer(
lambda: self._get_regularizer(concat_op), offset, length)
offset += length
self._replace_regularizer(r, slice_ref)
ops_to_concat.append(r)
# Create the concatenated regularizer itself.
return concat_and_slice_regularizers.ConcatRegularizer(ops_to_concat)
def _replace_regularizer(self, source, target):
"""Replaces `source` by 'target' in self's lookup tables."""
for op in self._regularizer_to_ops[source]:
assert self._op_to_regularizer[op] is source
self._op_to_regularizer[op] = target
self._regularizer_to_ops[target].append(op)
del self._regularizer_to_ops[source]
class _ConstantOpReg(generic_regularizers.OpRegularizer):
"""A class with the constant alive property, and zero regularization."""
def __init__(self, size):
self._regularization_vector = tf.zeros(size)
self._alive_vector = tf.cast(tf.ones(size), tf.bool)
@property
def regularization_vector(self):
return self._regularization_vector
@property
def alive_vector(self):
return self._alive_vector
# Copyright 2018 The TensorFlow Authors All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Tests for op_regularizer_manager."""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
from absl.testing import parameterized
import numpy as np
import tensorflow as tf
from morph_net.framework import op_regularizer_manager as orm
from morph_net.testing import op_regularizer_stub
layers = tf.contrib.layers
def _get_op(name):
return tf.get_default_graph().get_operation_by_name(name)
class TestOpRegularizerManager(parameterized.TestCase, tf.test.TestCase):
def setUp(self):
tf.reset_default_graph()
tf.set_random_seed(12)
np.random.seed(665544)
def _batch_norm_scope(self):
params = {
'trainable': True,
'normalizer_fn': layers.batch_norm,
'normalizer_params': {
'scale': True
}
}
with tf.contrib.framework.arg_scope([layers.conv2d], **params) as sc:
return sc
@parameterized.named_parameters(('Batch_no_par1', True, False, 'conv1'),
('Batch_par1', True, True, 'conv1'),
('NoBatch_no_par1', False, False, 'conv1'),
('NoBatch_par2', False, True, 'conv2'),
('Batch_no_par2', True, False, 'conv2'),
('Batch_par2', True, True, 'conv2'),
('Batch_par3', True, True, 'conv3'),
('NoBatch_par3', False, True, 'conv3'),
('NoBatch_no_par3', False, False, 'conv3'))
def testSimpleOpGetRegularizer(self, use_batch_norm, use_partitioner, scope):
# Tests the alive pattern of the conv and relu ops.
# use_batch_norm: A Boolean. Indicates if batch norm should be used.
# use_partitioner: A Boolean. Indicates if a fixed_size_partitioner should be
# used.
# scope: A String with the scope to test.
sc = self._batch_norm_scope() if use_batch_norm else []
partitioner = tf.fixed_size_partitioner(2) if use_partitioner else None
with tf.contrib.framework.arg_scope(sc):
with tf.variable_scope(tf.get_variable_scope(), partitioner=partitioner):
final_op = op_regularizer_stub.build_model()
op_reg_manager = orm.OpRegularizerManager([final_op],
op_regularizer_stub.MOCK_REG_DICT)
expected_alive = op_regularizer_stub.expected_alive()
with self.test_session():
conv_reg = op_reg_manager.get_regularizer(_get_op(scope + '/Conv2D'))
self.assertAllEqual(expected_alive[scope],
conv_reg.alive_vector.eval())
relu_reg = op_reg_manager.get_regularizer(_get_op(scope + '/Relu'))
self.assertAllEqual(expected_alive[scope],
relu_reg.alive_vector.eval())
@parameterized.named_parameters(('Batch_no_par', True, False),
('Batch_par', True, True),
('NoBatch_no_par', False, False),
('NoBatch_par', False, True))
def testConcatOpGetRegularizer(self, use_batch_norm, use_partitioner):
sc = self._batch_norm_scope() if use_batch_norm else []
partitioner = tf.fixed_size_partitioner(2) if use_partitioner else None
with tf.contrib.framework.arg_scope(sc):
with tf.variable_scope(tf.get_variable_scope(), partitioner=partitioner):
final_op = op_regularizer_stub.build_model()
op_reg_manager = orm.OpRegularizerManager([final_op],
op_regularizer_stub.MOCK_REG_DICT)
expected_alive = op_regularizer_stub.expected_alive()
expected = np.logical_or(expected_alive['conv4'],
expected_alive['concat'])
with self.test_session():
conv_reg = op_reg_manager.get_regularizer(_get_op('conv4/Conv2D'))
self.assertAllEqual(expected, conv_reg.alive_vector.eval())
relu_reg = op_reg_manager.get_regularizer(_get_op('conv4/Relu'))
self.assertAllEqual(expected, relu_reg.alive_vector.eval())
@parameterized.named_parameters(('Concat_5', True, 5),
('Concat_7', True, 7),
('Add_6', False, 6))
def testGetRegularizerForConcatWithNone(self, test_concat, depth):
image = tf.constant(0.0, shape=[1, 17, 19, 3])
conv2 = layers.conv2d(image, 5, [1, 1], padding='SAME', scope='conv2')
other_input = tf.add(
tf.identity(tf.constant(3.0, shape=[1, 17, 19, depth])), 3.0)
# other_input has None as regularizer.
concat = tf.concat([other_input, conv2], 3)
output = tf.add(concat, concat, name='output_out')
op = concat.op if test_concat else output.op
op_reg_manager = orm.OpRegularizerManager([output.op],
op_regularizer_stub.MOCK_REG_DICT)
expected_alive = op_regularizer_stub.expected_alive()
with self.test_session():
alive = op_reg_manager.get_regularizer(op).alive_vector.eval()
self.assertAllEqual([True] * depth, alive[:depth])
self.assertAllEqual(expected_alive['conv2'], alive[depth:])
@parameterized.named_parameters(('add', tf.add),
('div', tf.divide),
('mul', tf.multiply),
('max', tf.maximum),
('min', tf.minimum),
('l2', tf.squared_difference))
def testGroupingOps(self, tested_op):
th, size = 0.5, 11
image = tf.constant(0.5, shape=[1, 17, 19, 3])
conv1 = layers.conv2d(image, 5, [1, 1], padding='SAME', scope='conv1')
conv2 = layers.conv2d(image, 5, [1, 1], padding='SAME', scope='conv2')
res = tested_op(conv1, conv2)
reg = {'conv1': np.random.random(size), 'conv2': np.random.random(size)}
def regularizer(conv_op, manager=None):
del manager # unused
for prefix in ['conv1', 'conv2']:
if conv_op.name.startswith(prefix):
return op_regularizer_stub.OpRegularizerStub(
reg[prefix], reg[prefix] > th)
op_reg_manager = orm.OpRegularizerManager([res.op], {'Conv2D': regularizer})
with self.test_session():
alive = op_reg_manager.get_regularizer(res.op).alive_vector.eval()
self.assertAllEqual(alive,
np.logical_or(reg['conv1'] > th, reg['conv2'] > th))
if __name__ == '__main__':
tf.test.main()
# Copyright 2018 The TensorFlow Authors All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Helpers for Network Regularizers that are bilinear in their inputs/outputs.
Examples: The number of FLOPs and the number of weights of a convolution are both
bilinear expressions in the number of its inputs and outputs.
"""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import tensorflow as tf
from morph_net.framework import generic_regularizers
_CONV2D_OPS = ('Conv2D', 'Conv2DBackpropInput', 'DepthwiseConv2dNative')
_SUPPORTED_OPS = _CONV2D_OPS + ('MatMul',)
def _raise_if_not_supported(op):
if not isinstance(op, tf.Operation):
    raise ValueError('op must be a tf.Operation, not %s' % type(op))
if op.type not in _SUPPORTED_OPS:
    raise ValueError(
        'op must be one of %s, not %s' % (str(_SUPPORTED_OPS), op.type))
def _get_conv_filter_size(conv_op):
assert conv_op.type in _CONV2D_OPS
conv_weights = conv_op.inputs[1]
filter_shape = conv_weights.shape.as_list()[:2]
return filter_shape[0] * filter_shape[1]
def flop_coeff(op):
"""Computes the coefficient of number of flops associated with a convolution.
The FLOPs cost of a convolution is given by C * output_depth * input_depth,
where C = 2 * output_width * output_height * filter_size. The 2 is because we
have one multiplication and one addition for each convolution weight and
pixel. This function returns C.
Args:
    op: A tf.Operation of type 'Conv2D', 'Conv2DBackpropInput',
      'DepthwiseConv2dNative' or 'MatMul'.
Returns:
A float, the coefficient that when multiplied by the input depth and by the
output depth gives the number of flops needed to compute the convolution.
Raises:
    ValueError: op is not a tf.Operation of a supported type.
"""
_raise_if_not_supported(op)
if op.type in _CONV2D_OPS:
# Looking at the output shape makes it easy to automatically take into
# account strides and the type of padding.
if op.type == 'Conv2D' or op.type == 'DepthwiseConv2dNative':
shape = op.outputs[0].shape.as_list()
else: # Conv2DBackpropInput
# For a transposed convolution, the input and the output are swapped (as
# far as shapes are concerned). In other words, for a given filter shape
# and stride, if Conv2D maps from shapeX to shapeY, Conv2DBackpropInput
# maps from shapeY to shapeX. Therefore wherever we use the output shape
# for Conv2D, we use the input shape for Conv2DBackpropInput.
shape = _get_input(op).shape.as_list()
size = shape[1] * shape[2]
return 2.0 * size * _get_conv_filter_size(op)
else: # MatMul
# A MatMul is like a 1x1 conv with an output size of 1x1, so from the factor
# above only the 2.0 remains.
return 2.0
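# Illustrative worked example (hypothetical numbers, not taken from any test):
# a 3x3 'Conv2D' whose output feature map is 32x32 has
#   coeff = 2.0 * (32 * 32) * (3 * 3) = 18432.
# With 16 input channels and 64 output channels the convolution then costs
#   coeff * 16 * 64 = 18,874,368 FLOPs,
# matching number_of_flops = coeff * input_depth * output_depth.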
def num_weights_coeff(op):
"""The number of weights of a conv is C * output_depth * input_depth. Finds C.
Args:
op: A tf.Operation of type 'Conv2D' or 'MatMul'
Returns:
A float, the coefficient that when multiplied by the input depth and by the
output depth gives the number of flops needed to compute the convolution.
Raises:
ValueError: conv_op is not a tf.Operation of type Conv2D.
"""
_raise_if_not_supported(op)
return _get_conv_filter_size(op) if op.type in _CONV2D_OPS else 1.0
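# Illustrative worked example (hypothetical numbers): a 7x5 convolution has
# coeff = 35, so with 17 input channels and 19 output channels it holds
# 35 * 17 * 19 = 11305 weights; for a 'MatMul' the coeff is 1.0 and the weight
# count is simply input_depth * output_depth.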
class BilinearNetworkRegularizer(generic_regularizers.NetworkRegularizer):
"""A NetworkRegularizer with bilinear cost and loss.
Can be used for FLOPs regularization or for model size regularization.
"""
def __init__(self, opreg_manager, coeff_func):
"""Creates an instance.
Args:
opreg_manager: An OpRegularizerManager object that will be used to query
OpRegularizers of the various ops in the graph.
      coeff_func: A callable that receives a tf.Operation of a supported
        convolution or MatMul type and returns the bilinear coefficient of its
        cost. Examples:
        - Use flop_coeff for a FLOP regularizer.
        - Use num_weights_coeff for a number-of-weights regularizer.
"""
self._opreg_manager = opreg_manager
self._coeff_func = coeff_func
def _get_cost_or_regularization_term(self, is_regularization, ops=None):
total = 0.0
if not ops:
ops = self._opreg_manager.ops
for op in ops:
if op.type not in _SUPPORTED_OPS:
continue
      # We use the following expression for the regularizer:
#
# coeff * (number_of_inputs_alive * sum_of_output_regularizers +
# number_of_outputs_alive * sum_of_input_regularizers)
#
# where 'coeff' is a coefficient (for a particular convolution) such that
# the number of flops of that convolution is given by:
# number_of_flops = coeff * number_of_inputs * number_of_outputs.
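      # For instance (hypothetical numbers): if 16 of the inputs and 48 of the
      # outputs are alive, the cost term is coeff * 16 * 48, while the
      # regularization term is coeff * (16 * sum_of_output_regularizers +
      # 48 * sum_of_input_regularizers).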
input_op = _get_input(op).op
input_op_reg = self._opreg_manager.get_regularizer(input_op)
output_op_reg = self._opreg_manager.get_regularizer(op)
coeff = self._coeff_func(op)
num_alive_inputs = _count_alive(input_op, input_op_reg)
num_alive_outputs = _count_alive(op, output_op_reg)
if op.type == 'DepthwiseConv2dNative':
if is_regularization:
reg_inputs = _sum_of_reg_vector(input_op_reg)
reg_outputs = _sum_of_reg_vector(output_op_reg)
# reg_inputs and reg_outputs are often identical since they should
          # come from the same regularizer. Duplicate them for symmetry.
          # When the input doesn't have a regularizer (e.g. it is the network
          # input), only the second term is used.
# TODO: revisit this expression after experiments.
total += coeff * (reg_inputs + reg_outputs)
else:
          # num_alive_inputs may not always equal num_alive_outputs because the
          # input (e.g. the image) may not have a gamma regularizer. In this
          # case the computation is proportional only to num_alive_outputs.
total += coeff * num_alive_outputs
else:
if is_regularization:
reg_inputs = _sum_of_reg_vector(input_op_reg)
reg_outputs = _sum_of_reg_vector(output_op_reg)
total += coeff * (
num_alive_inputs * reg_outputs + num_alive_outputs * reg_inputs)
else:
total += coeff * num_alive_inputs * num_alive_outputs
return total
def get_cost(self, ops=None):
return self._get_cost_or_regularization_term(False, ops)
def get_regularization_term(self, ops=None):
return self._get_cost_or_regularization_term(True, ops)
def _get_input(op):
"""Returns the input to that op that represents the activations.
(as opposed to e.g. weights.)
Args:
op: A tf.Operation object with type in _SUPPORTED_OPS.
Returns:
A tf.Tensor representing the input activations.
Raises:
ValueError: MatMul is used with transposition (unsupported).
"""
assert op.type in _SUPPORTED_OPS, 'Op type %s is not supported.' % op.type
if op.type == 'Conv2D' or op.type == 'DepthwiseConv2dNative':
return op.inputs[0]
if op.type == 'Conv2DBackpropInput':
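    # For Conv2DBackpropInput the op inputs are (output_shape, filter,
    # activations), so the activations tensor is at index 2.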
return op.inputs[2]
if op.type == 'MatMul':
if op.get_attr('transpose_a') or op.get_attr('transpose_b'):
raise ValueError('MatMul with transposition is not yet supported.')
return op.inputs[0]
def _count_alive(op, opreg):
if opreg:
return tf.reduce_sum(tf.cast(opreg.alive_vector, tf.float32))
else:
return float(op.outputs[0].shape.as_list()[-1])
def _sum_of_reg_vector(opreg):
if opreg:
return tf.reduce_sum(opreg.regularization_vector)
else:
return 0.0
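# A minimal usage sketch (illustrative only; the OpRegularizerManager import
# path and the regularizer dict are assumptions, not defined in this file):
#
#   from morph_net.framework import op_regularizer_manager as orm
#   from morph_net.network_regularizers import bilinear_cost_utils
#
#   manager = orm.OpRegularizerManager([logits.op], regularizer_dict)
#   flop_reg = bilinear_cost_utils.BilinearNetworkRegularizer(
#       manager, bilinear_cost_utils.flop_coeff)
#   total_loss = task_loss + 1e-9 * flop_reg.get_regularization_term()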
# Copyright 2018 The TensorFlow Authors All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Tests for compute_cost_estimator.
Note that BilinearNetworkRegularizer is not tested here - its specific
instantiation is tested in flop_regularizer_test.py.
"""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import tensorflow as tf
from tensorflow.python.framework import ops
from morph_net.network_regularizers import bilinear_cost_utils
layers = tf.contrib.layers
def _flops(op):
"""Get the number of flops of a convolution, from the ops stats registry.
Args:
op: A tf.Operation object.
Returns:
    The number of flops needed to evaluate op.
"""
return (ops.get_stats_for_node_def(tf.get_default_graph(), op.node_def,
'flops').value)
def _output_depth(conv_op):
return conv_op.outputs[0].shape.as_list()[-1]
def _input_depth(conv_op):
conv_weights = conv_op.inputs[1]
return conv_weights.shape.as_list()[2]
class BilinearCostUtilTest(tf.test.TestCase):
def setUp(self):
tf.reset_default_graph()
image = tf.constant(0.0, shape=[1, 11, 13, 17])
net = layers.conv2d(
image, 19, [7, 5], stride=2, padding='SAME', scope='conv1')
layers.conv2d_transpose(
image, 29, [7, 5], stride=2, padding='SAME', scope='convt2')
net = tf.reduce_mean(net, axis=(1, 2))
layers.fully_connected(net, 23, scope='FC')
net = layers.conv2d(
image, 10, [7, 5], stride=2, padding='SAME', scope='conv2')
layers.separable_conv2d(
net, None, [3, 2], depth_multiplier=1, padding='SAME', scope='dw1')
self.conv_op = tf.get_default_graph().get_operation_by_name('conv1/Conv2D')
self.convt_op = tf.get_default_graph().get_operation_by_name(
'convt2/conv2d_transpose')
self.matmul_op = tf.get_default_graph().get_operation_by_name(
'FC/MatMul')
self.dw_op = tf.get_default_graph().get_operation_by_name(
'dw1/depthwise')
def assertNearRelatively(self, expected, actual):
self.assertNear(expected, actual, expected * 1e-6)
def testConvFlopsCoeff(self):
# Divide by the input depth and the output depth to get the coefficient.
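    # 17.0 is the input depth of `image` and 19.0 is the output depth of
    # 'conv1' as built in setUp above.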
expected_coeff = _flops(self.conv_op) / (17.0 * 19.0)
actual_coeff = bilinear_cost_utils.flop_coeff(self.conv_op)
self.assertNearRelatively(expected_coeff, actual_coeff)
def testConvTransposeFlopsCoeff(self):
# Divide by the input depth and the output depth to get the coefficient.
expected_coeff = _flops(self.convt_op) / (17.0 * 29.0)
actual_coeff = bilinear_cost_utils.flop_coeff(self.convt_op)
self.assertNearRelatively(expected_coeff, actual_coeff)
def testFcFlopsCoeff(self):
expected_coeff = _flops(self.matmul_op) / (19.0 * 23.0)
actual_coeff = bilinear_cost_utils.flop_coeff(self.matmul_op)
self.assertNearRelatively(expected_coeff, actual_coeff)
def testConvNumWeightsCoeff(self):
actual_coeff = bilinear_cost_utils.num_weights_coeff(self.conv_op)
    # The coefficient is just the filter size: 7 * 5 = 35.
self.assertNearRelatively(35, actual_coeff)
def testFcNumWeightsCoeff(self):
actual_coeff = bilinear_cost_utils.num_weights_coeff(self.matmul_op)
# The coefficient is 1.0, the number of weights is just inputs x outputs.
self.assertNearRelatively(1.0, actual_coeff)
def testDepthwiseConvFlopsCoeff(self):
# Divide by the input depth (which is also the output depth) to get the
# coefficient.
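    # 10.0 is the depth of 'conv2', which is both the input and the output
    # depth of the depthwise convolution built in setUp.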
expected_coeff = _flops(self.dw_op) / (10.0)
actual_coeff = bilinear_cost_utils.flop_coeff(self.dw_op)
self.assertNearRelatively(expected_coeff, actual_coeff)
if __name__ == '__main__':
tf.test.main()