ModelZoo / ResNet50_tensorflow / Commits

Commit 6c6f3f3a authored Jun 13, 2018 by Alan Mackey

Added new model, global objectives.

parent cac3a298

Showing 8 changed files with 3387 additions and 0 deletions (+3387 -0)
CODEOWNERS                                          +1    -0
research/global_objectives/README.md                +148  -0
research/global_objectives/loss_layers.py           +930  -0
research/global_objectives/loss_layers_example.py   +211  -0
research/global_objectives/loss_layers_test.py      +1379 -0
research/global_objectives/test_all.py              +37   -0
research/global_objectives/util.py                  +348  -0
research/global_objectives/util_test.py             +333  -0
CODEOWNERS

@@ -14,6 +14,7 @@
 /research/differential_privacy/ @ilyamironov @ananthr
 /research/domain_adaptation/ @bousmalis @dmrd
 /research/gan/ @joel-shor
+/research/global_objectives/ @mackeya-google
 /research/im2txt/ @cshallue
 /research/inception/ @shlens @vincentvanhoucke
 /research/learned_optimizer/ @olganw @nirum

research/global_objectives/README.md
0 → 100644
# Global Objectives

The Global Objectives library provides TensorFlow loss functions that optimize
directly for a variety of objectives including AUC, recall at precision, and
more. The global objectives losses can be used as drop-in replacements for
TensorFlow's standard multilabel loss functions:
`tf.nn.sigmoid_cross_entropy_with_logits` and `tf.losses.sigmoid_cross_entropy`.
Many machine learning classification models are optimized for classification
accuracy, even when the objective the user actually cares about is different,
such as precision at a fixed recall, precision-recall AUC, ROC AUC, or a similar
metric. These are referred to as "global objectives" because they depend on how
the model classifies the dataset as a whole and do not decouple across data
points as accuracy does.
Because these objectives are combinatorial, discontinuous, and essentially
intractable to optimize directly, the functions in this library approximate
their corresponding objectives. This approximation approach follows the same
pattern as optimizing for accuracy, where a surrogate objective such as
cross-entropy or the hinge loss is used as an upper bound on the error rate.
## Getting Started

For a full example of how to use the loss functions in practice, see
loss_layers_example.py.

Briefly, global objective losses can be used to replace
`tf.nn.sigmoid_cross_entropy_with_logits` by providing the relevant additional
arguments. For example,

```python
tf.nn.sigmoid_cross_entropy_with_logits(labels=labels, logits=logits)
```

could be replaced with

```python
global_objectives.recall_at_precision_loss(
    labels=labels, logits=logits, target_precision=0.95)[0]
```
Just as minimizing the cross-entropy loss will maximize accuracy, the loss
functions in loss_layers.py were written so that minimizing the loss will
maximize the corresponding objective.

The global objective losses have two return values -- the loss tensor and
additional quantities for debugging and customization -- which is why the first
value is used above. For more information, see
[Visualization & Debugging](#visualization--debugging).
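For orientation, here is a minimal sketch of how the two return values are
typically wired into a TF 1.x graph. The placeholder shapes and the plain
gradient-descent optimizer are illustrative choices, not part of the library:

```python
import tensorflow as tf
from global_objectives import loss_layers

# Placeholder model: one logit per example for a single binary label.
labels = tf.placeholder(tf.float32, shape=[None, 1])
logits = tf.placeholder(tf.float32, shape=[None, 1])

# First return value: the loss tensor; second: a dict of debugging tensors.
loss, other_outputs = loss_layers.recall_at_precision_loss(
    labels=labels, logits=logits, target_precision=0.95)

# The loss has the same shape as `logits`, so reduce it before optimizing.
train_op = tf.train.GradientDescentOptimizer(0.1).minimize(tf.reduce_mean(loss))
```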
## Binary Label Format

Binary classification problems can be represented as a multi-class problem with
two classes, or as a multi-label problem with one label. (Recall that multiclass
problems have mutually exclusive classes, e.g. 'cat xor dog', and multilabel
problems have classes which are not mutually exclusive, e.g. an image can
contain a cat, a dog, both, or neither.) The softmax loss
(`tf.nn.softmax_cross_entropy_with_logits`) is used for multi-class problems,
while the sigmoid loss (`tf.nn.sigmoid_cross_entropy_with_logits`) is used for
multi-label problems.
A multiclass label format for binary classification might represent positives
with the label [1, 0] and negatives with the label [0, 1], while the multilabel
format for the same problem would use [1] and [0], respectively.

All global objectives loss functions assume that the multilabel format is used.
Accordingly, if your current loss function is softmax, the labels will have to
be reformatted for the loss to work properly.
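As a concrete illustration of that reformatting, the sketch below assumes the
existing pipeline produces two-column one-hot labels with the positive class in
column 0 (the column convention is an assumption; adapt it to your data):

```python
import tensorflow as tf

# Hypothetical multiclass labels for a binary problem:
# [1, 0] = positive, [0, 1] = negative.
multiclass_labels = tf.constant([[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]])

# Multilabel format expected by the global objectives losses:
# a single column where 1.0 marks a positive example.
multilabel_labels = multiclass_labels[:, 0:1]  # shape [batch_size, 1]
```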
## Dual Variables

Global objectives losses (except for `roc_auc_loss`) use internal variables
called dual variables or Lagrange multipliers to enforce the desired constraint
(e.g. if optimizing for recall at precision, the constraint is on precision).

These dual variables are created and initialized internally by the loss
functions, and are updated during training by the same optimizer used for the
model's other variables. To initialize the dual variables to a particular value,
use the `lambdas_initializer` argument. The dual variables can be found under
the key `lambdas` in the `other_outputs` dictionary returned by the losses.
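For example, a small sketch of setting the initial value of the dual variables
and pulling them back out for inspection (the initial value 2.0 and the
placeholder tensors are arbitrary choices for illustration):

```python
import tensorflow as tf
from global_objectives import loss_layers

labels = tf.placeholder(tf.float32, shape=[None, 1])
logits = tf.placeholder(tf.float32, shape=[None, 1])

loss, other_outputs = loss_layers.recall_at_precision_loss(
    labels=labels,
    logits=logits,
    target_precision=0.9,
    # Start the Lagrange multipliers at 2.0 instead of the default 1.0.
    lambdas_initializer=tf.constant_initializer(2.0))

# The dual variables; the optimizer updates them along with the model weights.
lambdas = other_outputs['lambdas']
```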
## Loss Function Arguments

The following arguments are common to all loss functions in the library, and are
either required or very important.

* `labels`: Corresponds directly to the `labels` argument of
  `tf.nn.sigmoid_cross_entropy_with_logits`.
* `logits`: Corresponds directly to the `logits` argument of
  `tf.nn.sigmoid_cross_entropy_with_logits`.
* `dual_rate_factor`: A floating point value which controls the step size for
  the Lagrange multipliers. Setting this value less than 1.0 will cause the
  constraint to be enforced more gradually and will result in more stable
  training.

In addition, the objectives with a single constraint
(e.g. `recall_at_precision_loss`) have an argument (e.g. `target_precision`)
used to specify the value of the constraint. The optional `precision_range`
argument to `precision_recall_auc_loss` is used to specify the range of
precision values over which to optimize the AUC, and defaults to the
interval [0, 1].

Optional arguments:

* `weights`: A tensor which acts as coefficients for the loss. If a weight of x
  is provided for a datapoint and that datapoint is a true (false) positive
  (negative), it will be counted as x true (false) positives (negatives).
  Defaults to 1.0.
* `label_priors`: A tensor specifying the fraction of positive datapoints for
  each label. If not provided, it will be computed inside the loss function.
* `surrogate_type`: Either 'xent' or 'hinge', specifying which upper bound
  should be used for indicator functions.
* `lambdas_initializer`: An initializer for the dual variables (Lagrange
  multipliers). See also the Dual Variables section.
* `num_anchors` (precision_recall_auc_loss only): The number of grid points used
  when approximating the AUC as a Riemann sum.
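Putting several of these arguments together, a hedged sketch (the target recall,
weights, and other values are placeholders, not recommendations):

```python
import tensorflow as tf
from global_objectives import loss_layers

labels = tf.placeholder(tf.float32, shape=[None, 1])
logits = tf.placeholder(tf.float32, shape=[None, 1])
# Per-example coefficients, e.g. to count some examples more than once.
weights = tf.placeholder(tf.float32, shape=[None, 1])

loss, other_outputs = loss_layers.precision_at_recall_loss(
    labels=labels,
    logits=logits,
    target_recall=0.98,
    weights=weights,
    dual_rate_factor=0.5,     # enforce the recall constraint more gradually
    surrogate_type='hinge')   # hinge upper bound instead of cross-entropy
```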
## Hyperparameters

While the functional form of the global objectives losses allows them to be
easily substituted in place of `sigmoid_cross_entropy_with_logits`, model
hyperparameters such as learning rate, weight decay, etc. may need to be
fine-tuned to the new loss. Fortunately, the amount of hyperparameter re-tuning
is usually minor.

The most important hyperparameters to modify are the learning rate and
`dual_rate_factor` (see the section on Loss Function Arguments, above).
## Visualization & Debugging

The global objectives losses return two values. The first is a tensor
representing the numerical value of the loss, which can be passed to an
optimizer. The second is a dictionary of tensors created by the loss function
which are not necessary for optimization but useful in debugging. These vary
depending on the loss function, but usually include `lambdas` (the Lagrange
multipliers) as well as the lower bound on true positives and upper bound on
false positives.

When visualizing the loss during training, note that the global objectives
losses differ from standard losses in some important ways:

* The global losses may be negative. This is because the value returned by the
  loss includes terms involving the Lagrange multipliers, which may be negative.
* The global losses may not decrease over the course of training. To enforce the
  constraints in the objective, the loss changes over time and may increase.
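One way to watch these quantities during training is to wire the debugging
tensors into TensorBoard summaries; a rough sketch (the summary names and the
mean reduction are arbitrary choices, not part of the library):

```python
import tensorflow as tf
from global_objectives import loss_layers

labels = tf.placeholder(tf.float32, shape=[None, 1])
logits = tf.placeholder(tf.float32, shape=[None, 1])

loss, other_outputs = loss_layers.recall_at_precision_loss(
    labels=labels, logits=logits, target_precision=0.95)

tf.summary.scalar('global_objectives/loss', tf.reduce_mean(loss))
for name, tensor in other_outputs.items():
  # Most debugging outputs are per-label tensors; log their means.
  tf.summary.scalar('global_objectives/' + name, tf.reduce_mean(tensor))
merged_summaries = tf.summary.merge_all()
```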
## More Info

For more details, see the
[Global Objectives paper](https://arxiv.org/abs/1608.04802).

## Maintainers

* Mariano Schain
* Elad Eban
* [Alan Mackey](https://github.com/mackeya-google)

research/global_objectives/loss_layers.py
0 → 100644
# Copyright 2018 The TensorFlow Global Objectives Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Loss functions for learning global objectives.
These functions have two return values: a Tensor with the value of
the loss, and a dictionary of internal quantities for customizability.
"""
# Dependency imports
import numpy
import tensorflow as tf

from global_objectives import util
def precision_recall_auc_loss(
    labels,
    logits,
    precision_range=(0.0, 1.0),
    num_anchors=20,
    weights=1.0,
    dual_rate_factor=0.1,
    label_priors=None,
    surrogate_type='xent',
    lambdas_initializer=tf.constant_initializer(1.0),
    reuse=None,
    variables_collections=None,
    trainable=True,
    scope=None):
"""Computes precision-recall AUC loss.
The loss is based on a sum of losses for recall at a range of
precision values (anchor points). This sum is a Riemann sum that
approximates the area under the precision-recall curve.
The per-example `weights` argument changes not only the coefficients of
individual training examples, but how the examples are counted toward the
constraint. If `label_priors` is given, it MUST take `weights` into account.
That is,
label_priors = P / (P + N)
where
P = sum_i (wt_i on positives)
N = sum_i (wt_i on negatives).
Args:
labels: A `Tensor` of shape [batch_size] or [batch_size, num_labels].
logits: A `Tensor` with the same shape as `labels`.
precision_range: A length-two tuple, the range of precision values over
which to compute AUC. The entries must be nonnegative, increasing, and
less than or equal to 1.0.
num_anchors: The number of grid points used to approximate the Riemann sum.
weights: Coefficients for the loss. Must be a scalar or `Tensor` of shape
[batch_size] or [batch_size, num_labels].
dual_rate_factor: A floating point value which controls the step size for
the Lagrange multipliers.
label_priors: None, or a floating point `Tensor` of shape [num_labels]
containing the prior probability of each label (i.e. the fraction of the
training data consisting of positive examples). If None, the label
priors are computed from `labels` with a moving average. See the notes
above regarding the interaction with `weights` and do not set this unless
you have a good reason to do so.
surrogate_type: Either 'xent' or 'hinge', specifying which upper bound
should be used for indicator functions.
lambdas_initializer: An initializer for the Lagrange multipliers.
reuse: Whether or not the layer and its variables should be reused. To be
able to reuse the layer scope must be given.
variables_collections: Optional list of collections for the variables.
trainable: If `True` also add variables to the graph collection
`GraphKeys.TRAINABLE_VARIABLES` (see `tf.Variable`).
scope: Optional scope for `variable_scope`.
Returns:
loss: A `Tensor` of the same shape as `logits` with the component-wise
loss.
other_outputs: A dictionary of useful internal quantities for debugging. For
more details, see http://arxiv.org/pdf/1608.04802.pdf.
lambdas: A Tensor of shape [1, num_labels, num_anchors] consisting of the
Lagrange multipliers.
biases: A Tensor of shape [1, num_labels, num_anchors] consisting of the
learned bias term for each.
label_priors: A Tensor of shape [1, num_labels, 1] consisting of the prior
probability of each label learned by the loss, if not provided.
true_positives_lower_bound: Lower bound on the number of true positives
given `labels` and `logits`. This is the same lower bound which is used
in the loss expression to be optimized.
false_positives_upper_bound: Upper bound on the number of false positives
given `labels` and `logits`. This is the same upper bound which is used
in the loss expression to be optimized.
Raises:
ValueError: If `surrogate_type` is not `xent` or `hinge`.
"""
  with tf.variable_scope(scope,
                         'precision_recall_auc',
                         [labels, logits, label_priors],
                         reuse=reuse):
    labels, logits, weights, original_shape = _prepare_labels_logits_weights(
        labels, logits, weights)
    num_labels = util.get_num_labels(logits)

    # Convert other inputs to tensors and standardize dtypes.
    dual_rate_factor = util.convert_and_cast(
        dual_rate_factor, 'dual_rate_factor', logits.dtype)

    # Create Tensor of anchor points and distance between anchors.
    precision_values, delta = _range_to_anchors_and_delta(
        precision_range, num_anchors, logits.dtype)
    # Create lambdas with shape [1, num_labels, num_anchors].
    lambdas, lambdas_variable = _create_dual_variable(
        'lambdas',
        shape=[1, num_labels, num_anchors],
        dtype=logits.dtype,
        initializer=lambdas_initializer,
        collections=variables_collections,
        trainable=trainable,
        dual_rate_factor=dual_rate_factor)
    # Create biases with shape [1, num_labels, num_anchors].
    biases = tf.contrib.framework.model_variable(
        name='biases',
        shape=[1, num_labels, num_anchors],
        dtype=logits.dtype,
        initializer=tf.zeros_initializer(),
        collections=variables_collections,
        trainable=trainable)
    # Maybe create label_priors.
    label_priors = maybe_create_label_priors(
        label_priors, labels, weights, variables_collections)
    label_priors = tf.reshape(label_priors, [1, num_labels, 1])

    # Expand logits, labels, and weights to shape [batch_size, num_labels, 1].
    logits = tf.expand_dims(logits, 2)
    labels = tf.expand_dims(labels, 2)
    weights = tf.expand_dims(weights, 2)

    # Calculate weighted loss and other outputs. The log(2.0) term corrects for
    # logloss not being an upper bound on the indicator function.
    loss = weights * util.weighted_surrogate_loss(
        labels,
        logits + biases,
        surrogate_type=surrogate_type,
        positive_weights=1.0 + lambdas * (1.0 - precision_values),
        negative_weights=lambdas * precision_values)
    maybe_log2 = tf.log(2.0) if surrogate_type == 'xent' else 1.0
    maybe_log2 = tf.cast(maybe_log2, logits.dtype.base_dtype)
    lambda_term = lambdas * (1.0 - precision_values) * label_priors * maybe_log2
    per_anchor_loss = loss - lambda_term
    per_label_loss = delta * tf.reduce_sum(per_anchor_loss, 2)
    # Normalize the AUC such that a perfect score function will have AUC 1.0.
    # Because precision_range is discretized into num_anchors + 1 intervals
    # but only num_anchors terms are included in the Riemann sum, the
    # effective length of the integration interval is `delta` less than the
    # length of precision_range.
    scaled_loss = tf.div(per_label_loss,
                         precision_range[1] - precision_range[0] - delta,
                         name='AUC_Normalize')
    scaled_loss = tf.reshape(scaled_loss, original_shape)

    other_outputs = {
        'lambdas': lambdas_variable,
        'biases': biases,
        'label_priors': label_priors,
        'true_positives_lower_bound': true_positives_lower_bound(
            labels, logits, weights, surrogate_type),
        'false_positives_upper_bound': false_positives_upper_bound(
            labels, logits, weights, surrogate_type)}

    return scaled_loss, other_outputs


def roc_auc_loss(
    labels, logits, weights=1.0, surrogate_type='xent', scope=None):
"""Computes ROC AUC loss.
The area under the ROC curve is the probability p that a randomly chosen
positive example will be scored higher than a randomly chosen negative
example. This loss approximates 1-p by using a surrogate (either hinge loss or
cross entropy) for the indicator function. Specifically, the loss is:
sum_i sum_j w_i*w_j*loss(logit_i - logit_j)
where i ranges over the positive datapoints, j ranges over the negative
datapoints, logit_k denotes the logit (or score) of the k-th datapoint, and
loss is either the hinge or log loss given a positive label.
Args:
labels: A `Tensor` of shape [batch_size] or [batch_size, num_labels].
logits: A `Tensor` with the same shape and dtype as `labels`.
weights: Coefficients for the loss. Must be a scalar or `Tensor` of shape
[batch_size] or [batch_size, num_labels].
surrogate_type: Either 'xent' or 'hinge', specifying which upper bound
should be used for the indicator function.
scope: Optional scope for `name_scope`.
Returns:
loss: A `Tensor` of the same shape as `logits` with the component-wise loss.
other_outputs: An empty dictionary, for consistency.
Raises:
ValueError: If `surrogate_type` is not `xent` or `hinge`.
"""
  with tf.name_scope(scope, 'roc_auc', [labels, logits, weights]):
    # Convert inputs to tensors and standardize dtypes.
    labels, logits, weights, original_shape = _prepare_labels_logits_weights(
        labels, logits, weights)

    # Create tensors of pairwise differences for logits and labels, and
    # pairwise products of weights. These have shape
    # [batch_size, batch_size, num_labels].
    logits_difference = tf.expand_dims(logits, 0) - tf.expand_dims(logits, 1)
    labels_difference = tf.expand_dims(labels, 0) - tf.expand_dims(labels, 1)
    weights_product = tf.expand_dims(weights, 0) * tf.expand_dims(weights, 1)

    signed_logits_difference = labels_difference * logits_difference
    raw_loss = util.weighted_surrogate_loss(
        labels=tf.ones_like(signed_logits_difference),
        logits=signed_logits_difference,
        surrogate_type=surrogate_type)
    weighted_loss = weights_product * raw_loss

    # Zero out entries of the loss where labels_difference zero (so loss is only
    # computed on pairs with different labels).
    loss = tf.reduce_mean(tf.abs(labels_difference) * weighted_loss, 0) * 0.5
    loss = tf.reshape(loss, original_shape)
    return loss, {}


def recall_at_precision_loss(
    labels,
    logits,
    target_precision,
    weights=1.0,
    dual_rate_factor=0.1,
    label_priors=None,
    surrogate_type='xent',
    lambdas_initializer=tf.constant_initializer(1.0),
    reuse=None,
    variables_collections=None,
    trainable=True,
    scope=None):
"""Computes recall at precision loss.
The loss is based on a surrogate of the form
wt * w(+) * loss(+) + wt * w(-) * loss(-) - c * pi,
where:
- w(+) = 1 + lambdas * (1 - target_precision)
- loss(+) is the cross-entropy loss on the positive examples
- w(-) = lambdas * target_precision
- loss(-) is the cross-entropy loss on the negative examples
- wt is a scalar or tensor of per-example weights
- c = lambdas * (1 - target_precision)
- pi is the label_priors.
The per-example weights change not only the coefficients of individual
training examples, but how the examples are counted toward the constraint.
If `label_priors` is given, it MUST take `weights` into account. That is,
label_priors = P / (P + N)
where
P = sum_i (wt_i on positives)
N = sum_i (wt_i on negatives).
Args:
labels: A `Tensor` of shape [batch_size] or [batch_size, num_labels].
logits: A `Tensor` with the same shape as `labels`.
target_precision: The precision at which to compute the loss. Can be a
floating point value between 0 and 1 for a single precision value, or a
`Tensor` of shape [num_labels], holding each label's target precision
value.
weights: Coefficients for the loss. Must be a scalar or `Tensor` of shape
[batch_size] or [batch_size, num_labels].
dual_rate_factor: A floating point value which controls the step size for
the Lagrange multipliers.
label_priors: None, or a floating point `Tensor` of shape [num_labels]
containing the prior probability of each label (i.e. the fraction of the
training data consisting of positive examples). If None, the label
priors are computed from `labels` with a moving average. See the notes
above regarding the interaction with `weights` and do not set this unless
you have a good reason to do so.
surrogate_type: Either 'xent' or 'hinge', specifying which upper bound
should be used for indicator functions.
lambdas_initializer: An initializer for the Lagrange multipliers.
reuse: Whether or not the layer and its variables should be reused. To be
able to reuse the layer scope must be given.
variables_collections: Optional list of collections for the variables.
trainable: If `True` also add variables to the graph collection
`GraphKeys.TRAINABLE_VARIABLES` (see `tf.Variable`).
scope: Optional scope for `variable_scope`.
Returns:
loss: A `Tensor` of the same shape as `logits` with the component-wise
loss.
other_outputs: A dictionary of useful internal quantities for debugging. For
more details, see http://arxiv.org/pdf/1608.04802.pdf.
lambdas: A Tensor of shape [num_labels] consisting of the Lagrange
multipliers.
label_priors: A Tensor of shape [num_labels] consisting of the prior
probability of each label learned by the loss, if not provided.
true_positives_lower_bound: Lower bound on the number of true positives
given `labels` and `logits`. This is the same lower bound which is used
in the loss expression to be optimized.
false_positives_upper_bound: Upper bound on the number of false positives
given `labels` and `logits`. This is the same upper bound which is used
in the loss expression to be optimized.
Raises:
ValueError: If `logits` and `labels` do not have the same shape.
"""
  with tf.variable_scope(scope,
                         'recall_at_precision',
                         [logits, labels, label_priors],
                         reuse=reuse):
    labels, logits, weights, original_shape = _prepare_labels_logits_weights(
        labels, logits, weights)
    num_labels = util.get_num_labels(logits)

    # Convert other inputs to tensors and standardize dtypes.
    target_precision = util.convert_and_cast(
        target_precision, 'target_precision', logits.dtype)
    dual_rate_factor = util.convert_and_cast(
        dual_rate_factor, 'dual_rate_factor', logits.dtype)

    # Create lambdas.
    lambdas, lambdas_variable = _create_dual_variable(
        'lambdas',
        shape=[num_labels],
        dtype=logits.dtype,
        initializer=lambdas_initializer,
        collections=variables_collections,
        trainable=trainable,
        dual_rate_factor=dual_rate_factor)
    # Maybe create label_priors.
    label_priors = maybe_create_label_priors(
        label_priors, labels, weights, variables_collections)

    # Calculate weighted loss and other outputs. The log(2.0) term corrects for
    # logloss not being an upper bound on the indicator function.
    weighted_loss = weights * util.weighted_surrogate_loss(
        labels,
        logits,
        surrogate_type=surrogate_type,
        positive_weights=1.0 + lambdas * (1.0 - target_precision),
        negative_weights=lambdas * target_precision)
    maybe_log2 = tf.log(2.0) if surrogate_type == 'xent' else 1.0
    maybe_log2 = tf.cast(maybe_log2, logits.dtype.base_dtype)
    lambda_term = lambdas * (1.0 - target_precision) * label_priors * maybe_log2
    loss = tf.reshape(weighted_loss - lambda_term, original_shape)
    other_outputs = {
        'lambdas': lambdas_variable,
        'label_priors': label_priors,
        'true_positives_lower_bound': true_positives_lower_bound(
            labels, logits, weights, surrogate_type),
        'false_positives_upper_bound': false_positives_upper_bound(
            labels, logits, weights, surrogate_type)}

    return loss, other_outputs


def precision_at_recall_loss(
    labels,
    logits,
    target_recall,
    weights=1.0,
    dual_rate_factor=0.1,
    label_priors=None,
    surrogate_type='xent',
    lambdas_initializer=tf.constant_initializer(1.0),
    reuse=None,
    variables_collections=None,
    trainable=True,
    scope=None):
"""Computes precision at recall loss.
The loss is based on a surrogate of the form
wt * loss(-) + lambdas * (pi * (b - 1) + wt * loss(+))
where:
- loss(-) is the cross-entropy loss on the negative examples
- loss(+) is the cross-entropy loss on the positive examples
- wt is a scalar or tensor of per-example weights
- b is the target recall
- pi is the label_priors.
The per-example weights change not only the coefficients of individual
training examples, but how the examples are counted toward the constraint.
If `label_priors` is given, it MUST take `weights` into account. That is,
label_priors = P / (P + N)
where
P = sum_i (wt_i on positives)
N = sum_i (wt_i on negatives).
Args:
labels: A `Tensor` of shape [batch_size] or [batch_size, num_labels].
logits: A `Tensor` with the same shape as `labels`.
target_recall: The recall at which to compute the loss. Can be a floating
point value between 0 and 1 for a single target recall value, or a
`Tensor` of shape [num_labels] holding each label's target recall value.
weights: Coefficients for the loss. Must be a scalar or `Tensor` of shape
[batch_size] or [batch_size, num_labels].
dual_rate_factor: A floating point value which controls the step size for
the Lagrange multipliers.
label_priors: None, or a floating point `Tensor` of shape [num_labels]
containing the prior probability of each label (i.e. the fraction of the
training data consisting of positive examples). If None, the label
priors are computed from `labels` with a moving average. See the notes
above regarding the interaction with `weights` and do not set this unless
you have a good reason to do so.
surrogate_type: Either 'xent' or 'hinge', specifying which upper bound
should be used for indicator functions.
lambdas_initializer: An initializer for the Lagrange multipliers.
reuse: Whether or not the layer and its variables should be reused. To be
able to reuse the layer scope must be given.
variables_collections: Optional list of collections for the variables.
trainable: If `True` also add variables to the graph collection
`GraphKeys.TRAINABLE_VARIABLES` (see `tf.Variable`).
scope: Optional scope for `variable_scope`.
Returns:
loss: A `Tensor` of the same shape as `logits` with the component-wise
loss.
other_outputs: A dictionary of useful internal quantities for debugging. For
more details, see http://arxiv.org/pdf/1608.04802.pdf.
lambdas: A Tensor of shape [num_labels] consisting of the Lagrange
multipliers.
label_priors: A Tensor of shape [num_labels] consisting of the prior
probability of each label learned by the loss, if not provided.
true_positives_lower_bound: Lower bound on the number of true positives
given `labels` and `logits`. This is the same lower bound which is used
in the loss expression to be optimized.
false_positives_upper_bound: Upper bound on the number of false positives
given `labels` and `logits`. This is the same upper bound which is used
in the loss expression to be optimized.
"""
  with tf.variable_scope(scope,
                         'precision_at_recall',
                         [logits, labels, label_priors],
                         reuse=reuse):
    labels, logits, weights, original_shape = _prepare_labels_logits_weights(
        labels, logits, weights)
    num_labels = util.get_num_labels(logits)

    # Convert other inputs to tensors and standardize dtypes.
    target_recall = util.convert_and_cast(
        target_recall, 'target_recall', logits.dtype)
    dual_rate_factor = util.convert_and_cast(
        dual_rate_factor, 'dual_rate_factor', logits.dtype)

    # Create lambdas.
    lambdas, lambdas_variable = _create_dual_variable(
        'lambdas',
        shape=[num_labels],
        dtype=logits.dtype,
        initializer=lambdas_initializer,
        collections=variables_collections,
        trainable=trainable,
        dual_rate_factor=dual_rate_factor)
    # Maybe create label_priors.
    label_priors = maybe_create_label_priors(
        label_priors, labels, weights, variables_collections)

    # Calculate weighted loss and other outputs. The log(2.0) term corrects for
    # logloss not being an upper bound on the indicator function.
    weighted_loss = weights * util.weighted_surrogate_loss(
        labels,
        logits,
        surrogate_type,
        positive_weights=lambdas,
        negative_weights=1.0)
    maybe_log2 = tf.log(2.0) if surrogate_type == 'xent' else 1.0
    maybe_log2 = tf.cast(maybe_log2, logits.dtype.base_dtype)
    lambda_term = lambdas * label_priors * (target_recall - 1.0) * maybe_log2
    loss = tf.reshape(weighted_loss + lambda_term, original_shape)
    other_outputs = {
        'lambdas': lambdas_variable,
        'label_priors': label_priors,
        'true_positives_lower_bound': true_positives_lower_bound(
            labels, logits, weights, surrogate_type),
        'false_positives_upper_bound': false_positives_upper_bound(
            labels, logits, weights, surrogate_type)}

    return loss, other_outputs


def false_positive_rate_at_true_positive_rate_loss(
    labels,
    logits,
    target_rate,
    weights=1.0,
    dual_rate_factor=0.1,
    label_priors=None,
    surrogate_type='xent',
    lambdas_initializer=tf.constant_initializer(1.0),
    reuse=None,
    variables_collections=None,
    trainable=True,
    scope=None):
"""Computes false positive rate at true positive rate loss.
Note that `true positive rate` is a synonym for Recall, and that minimizing
the false positive rate and maximizing precision are equivalent for a fixed
Recall. Therefore, this function is identical to precision_at_recall_loss.
The per-example weights change not only the coefficients of individual
training examples, but how the examples are counted toward the constraint.
If `label_priors` is given, it MUST take `weights` into account. That is,
label_priors = P / (P + N)
where
P = sum_i (wt_i on positives)
N = sum_i (wt_i on negatives).
Args:
labels: A `Tensor` of shape [batch_size] or [batch_size, num_labels].
logits: A `Tensor` with the same shape as `labels`.
target_rate: The true positive rate at which to compute the loss. Can be a
floating point value between 0 and 1 for a single true positive rate, or
a `Tensor` of shape [num_labels] holding each label's true positive rate.
weights: Coefficients for the loss. Must be a scalar or `Tensor` of shape
[batch_size] or [batch_size, num_labels].
dual_rate_factor: A floating point value which controls the step size for
the Lagrange multipliers.
label_priors: None, or a floating point `Tensor` of shape [num_labels]
containing the prior probability of each label (i.e. the fraction of the
training data consisting of positive examples). If None, the label
priors are computed from `labels` with a moving average. See the notes
above regarding the interaction with `weights` and do not set this unless
you have a good reason to do so.
surrogate_type: Either 'xent' or 'hinge', specifying which upper bound
should be used for indicator functions. 'xent' will use the cross-entropy
loss surrogate, and 'hinge' will use the hinge loss.
lambdas_initializer: An initializer op for the Lagrange multipliers.
reuse: Whether or not the layer and its variables should be reused. To be
able to reuse the layer scope must be given.
variables_collections: Optional list of collections for the variables.
trainable: If `True` also add variables to the graph collection
`GraphKeys.TRAINABLE_VARIABLES` (see `tf.Variable`).
scope: Optional scope for `variable_scope`.
Returns:
loss: A `Tensor` of the same shape as `logits` with the component-wise
loss.
other_outputs: A dictionary of useful internal quantities for debugging. For
more details, see http://arxiv.org/pdf/1608.04802.pdf.
lambdas: A Tensor of shape [num_labels] consisting of the Lagrange
multipliers.
label_priors: A Tensor of shape [num_labels] consisting of the prior
probability of each label learned by the loss, if not provided.
true_positives_lower_bound: Lower bound on the number of true positives
given `labels` and `logits`. This is the same lower bound which is used
in the loss expression to be optimized.
false_positives_upper_bound: Upper bound on the number of false positives
given `labels` and `logits`. This is the same upper bound which is used
in the loss expression to be optimized.
Raises:
ValueError: If `surrogate_type` is not `xent` or `hinge`.
"""
  return precision_at_recall_loss(labels=labels,
                                  logits=logits,
                                  target_recall=target_rate,
                                  weights=weights,
                                  dual_rate_factor=dual_rate_factor,
                                  label_priors=label_priors,
                                  surrogate_type=surrogate_type,
                                  lambdas_initializer=lambdas_initializer,
                                  reuse=reuse,
                                  variables_collections=variables_collections,
                                  trainable=trainable,
                                  scope=scope)


def true_positive_rate_at_false_positive_rate_loss(
    labels,
    logits,
    target_rate,
    weights=1.0,
    dual_rate_factor=0.1,
    label_priors=None,
    surrogate_type='xent',
    lambdas_initializer=tf.constant_initializer(1.0),
    reuse=None,
    variables_collections=None,
    trainable=True,
    scope=None):
"""Computes true positive rate at false positive rate loss.
The loss is based on a surrogate of the form
wt * loss(+) + lambdas * (wt * loss(-) - r * (1 - pi))
where:
- loss(-) is the loss on the negative examples
- loss(+) is the loss on the positive examples
- wt is a scalar or tensor of per-example weights
- r is the target rate
- pi is the label_priors.
The per-example weights change not only the coefficients of individual
training examples, but how the examples are counted toward the constraint.
If `label_priors` is given, it MUST take `weights` into account. That is,
label_priors = P / (P + N)
where
P = sum_i (wt_i on positives)
N = sum_i (wt_i on negatives).
Args:
labels: A `Tensor` of shape [batch_size] or [batch_size, num_labels].
logits: A `Tensor` with the same shape as `labels`.
target_rate: The false positive rate at which to compute the loss. Can be a
floating point value between 0 and 1 for a single false positive rate, or
a `Tensor` of shape [num_labels] holding each label's false positive rate.
weights: Coefficients for the loss. Must be a scalar or `Tensor` of shape
[batch_size] or [batch_size, num_labels].
dual_rate_factor: A floating point value which controls the step size for
the Lagrange multipliers.
label_priors: None, or a floating point `Tensor` of shape [num_labels]
containing the prior probability of each label (i.e. the fraction of the
training data consisting of positive examples). If None, the label
priors are computed from `labels` with a moving average. See the notes
above regarding the interaction with `weights` and do not set this unless
you have a good reason to do so.
surrogate_type: Either 'xent' or 'hinge', specifying which upper bound
should be used for indicator functions. 'xent' will use the cross-entropy
loss surrogate, and 'hinge' will use the hinge loss.
lambdas_initializer: An initializer op for the Lagrange multipliers.
reuse: Whether or not the layer and its variables should be reused. To be
able to reuse the layer scope must be given.
variables_collections: Optional list of collections for the variables.
trainable: If `True` also add variables to the graph collection
`GraphKeys.TRAINABLE_VARIABLES` (see `tf.Variable`).
scope: Optional scope for `variable_scope`.
Returns:
loss: A `Tensor` of the same shape as `logits` with the component-wise
loss.
other_outputs: A dictionary of useful internal quantities for debugging. For
more details, see http://arxiv.org/pdf/1608.04802.pdf.
lambdas: A Tensor of shape [num_labels] consisting of the Lagrange
multipliers.
label_priors: A Tensor of shape [num_labels] consisting of the prior
probability of each label learned by the loss, if not provided.
true_positives_lower_bound: Lower bound on the number of true positives
given `labels` and `logits`. This is the same lower bound which is used
in the loss expression to be optimized.
false_positives_upper_bound: Upper bound on the number of false positives
given `labels` and `logits`. This is the same upper bound which is used
in the loss expression to be optimized.
Raises:
ValueError: If `surrogate_type` is not `xent` or `hinge`.
"""
  with tf.variable_scope(scope,
                         'tpr_at_fpr',
                         [labels, logits, label_priors],
                         reuse=reuse):
    labels, logits, weights, original_shape = _prepare_labels_logits_weights(
        labels, logits, weights)
    num_labels = util.get_num_labels(logits)

    # Convert other inputs to tensors and standardize dtypes.
    target_rate = util.convert_and_cast(
        target_rate, 'target_rate', logits.dtype)
    dual_rate_factor = util.convert_and_cast(
        dual_rate_factor, 'dual_rate_factor', logits.dtype)

    # Create lambdas.
    lambdas, lambdas_variable = _create_dual_variable(
        'lambdas',
        shape=[num_labels],
        dtype=logits.dtype,
        initializer=lambdas_initializer,
        collections=variables_collections,
        trainable=trainable,
        dual_rate_factor=dual_rate_factor)
    # Maybe create label_priors.
    label_priors = maybe_create_label_priors(
        label_priors, labels, weights, variables_collections)

    # Loss op and other outputs. The log(2.0) term corrects for
    # logloss not being an upper bound on the indicator function.
    weighted_loss = weights * util.weighted_surrogate_loss(
        labels,
        logits,
        surrogate_type=surrogate_type,
        positive_weights=1.0,
        negative_weights=lambdas)
    maybe_log2 = tf.log(2.0) if surrogate_type == 'xent' else 1.0
    maybe_log2 = tf.cast(maybe_log2, logits.dtype.base_dtype)
    lambda_term = lambdas * target_rate * (1.0 - label_priors) * maybe_log2
    loss = tf.reshape(weighted_loss - lambda_term, original_shape)
    other_outputs = {
        'lambdas': lambdas_variable,
        'label_priors': label_priors,
        'true_positives_lower_bound': true_positives_lower_bound(
            labels, logits, weights, surrogate_type),
        'false_positives_upper_bound': false_positives_upper_bound(
            labels, logits, weights, surrogate_type)}

    return loss, other_outputs


def _prepare_labels_logits_weights(labels, logits, weights):
"""Validates labels, logits, and weights.
Converts inputs to tensors, checks shape compatibility, and casts dtype if
necessary.
Args:
labels: A `Tensor` of shape [batch_size] or [batch_size, num_labels].
logits: A `Tensor` with the same shape as `labels`.
weights: Either `None` or a `Tensor` with shape broadcastable to `logits`.
Returns:
labels: Same as `labels` arg after possible conversion to tensor, cast, and
reshape.
logits: Same as `logits` arg after possible conversion to tensor and
reshape.
weights: Same as `weights` arg after possible conversion, cast, and reshape.
original_shape: Shape of `labels` and `logits` before reshape.
Raises:
ValueError: If `labels` and `logits` do not have the same shape.
"""
  # Convert `labels` and `logits` to Tensors and standardize dtypes.
  logits = tf.convert_to_tensor(logits, name='logits')
  labels = util.convert_and_cast(labels, 'labels', logits.dtype.base_dtype)
  weights = util.convert_and_cast(weights, 'weights', logits.dtype.base_dtype)

  try:
    labels.get_shape().merge_with(logits.get_shape())
  except ValueError:
    raise ValueError('logits and labels must have the same shape (%s vs %s)'
                     % (logits.get_shape(), labels.get_shape()))

  original_shape = labels.get_shape().as_list()
  if labels.get_shape().ndims > 0:
    original_shape[0] = -1
  if labels.get_shape().ndims <= 1:
    labels = tf.reshape(labels, [-1, 1])
    logits = tf.reshape(logits, [-1, 1])

  if weights.get_shape().ndims == 1:
    # Weights has shape [batch_size]. Reshape to [batch_size, 1].
    weights = tf.reshape(weights, [-1, 1])
  if weights.get_shape().ndims == 0:
    # Weights is a scalar. Change shape of weights to match logits.
    weights *= tf.ones_like(logits)

  return labels, logits, weights, original_shape


def _range_to_anchors_and_delta(precision_range, num_anchors, dtype):
"""Calculates anchor points from precision range.
Args:
precision_range: As required in precision_recall_auc_loss.
num_anchors: int, number of equally spaced anchor points.
dtype: Data type of returned tensors.
Returns:
precision_values: A `Tensor` of data type dtype with equally spaced values
in the interval precision_range.
delta: The spacing between the values in precision_values.
Raises:
ValueError: If precision_range is invalid.
"""
  # Validate precision_range.
  if not 0 <= precision_range[0] <= precision_range[-1] <= 1:
    raise ValueError('precision values must obey 0 <= %f <= %f <= 1' %
                     (precision_range[0], precision_range[-1]))
  if not 0 < len(precision_range) < 3:
    raise ValueError('length of precision_range (%d) must be 1 or 2' %
                     len(precision_range))

  # Sets precision_values uniformly between min_precision and max_precision.
  values = numpy.linspace(start=precision_range[0],
                          stop=precision_range[1],
                          num=num_anchors+2)[1:-1]
  precision_values = util.convert_and_cast(
      values, 'precision_values', dtype)
  delta = util.convert_and_cast(
      values[0] - precision_range[0], 'delta', dtype)
  # Makes precision_values [1, 1, num_anchors].
  precision_values = util.expand_outer(precision_values, 3)
  return precision_values, delta


def _create_dual_variable(name, shape, dtype, initializer, collections,
                          trainable, dual_rate_factor):
"""Creates a new dual variable.
Dual variables are required to be nonnegative. If trainable, their gradient
is reversed so that they are maximized (rather than minimized) by the
optimizer.
Args:
name: A string, the name for the new variable.
shape: Shape of the new variable.
dtype: Data type for the new variable.
initializer: Initializer for the new variable.
collections: List of graph collections keys. The new variable is added to
these collections. Defaults to `[GraphKeys.GLOBAL_VARIABLES]`.
trainable: If `True`, the default, also adds the variable to the graph
collection `GraphKeys.TRAINABLE_VARIABLES`. This collection is used as
the default list of variables to use by the `Optimizer` classes.
dual_rate_factor: A floating point value or `Tensor`. The learning rate for
the dual variable is scaled by this factor.
Returns:
dual_value: An op that computes the absolute value of the dual variable
and reverses its gradient.
dual_variable: The underlying variable itself.
"""
  # We disable partitioning while constructing dual variables because they will
  # be updated with assign, which is not available for partitioned variables.
  partitioner = tf.get_variable_scope().partitioner
  try:
    tf.get_variable_scope().set_partitioner(None)
    dual_variable = tf.contrib.framework.model_variable(
        name=name,
        shape=shape,
        dtype=dtype,
        initializer=initializer,
        collections=collections,
        trainable=trainable)
  finally:
    tf.get_variable_scope().set_partitioner(partitioner)
  # Using the absolute value enforces nonnegativity.
  dual_value = tf.abs(dual_variable)

  if trainable:
    # To reverse the gradient on the dual variable, multiply the gradient by
    # -dual_rate_factor
    dual_value = (tf.stop_gradient((1.0 + dual_rate_factor) * dual_value)
                  - dual_rate_factor * dual_value)
  return dual_value, dual_variable


def maybe_create_label_priors(label_priors,
                              labels,
                              weights,
                              variables_collections):
"""Creates moving average ops to track label priors, if necessary.
Args:
label_priors: As required in e.g. precision_recall_auc_loss.
labels: A `Tensor` of shape [batch_size] or [batch_size, num_labels].
weights: As required in e.g. precision_recall_auc_loss.
variables_collections: Optional list of collections for the variables, if
any must be created.
Returns:
label_priors: A Tensor of shape [num_labels] consisting of the
weighted label priors, after updating with moving average ops if created.
"""
  if label_priors is not None:
    label_priors = util.convert_and_cast(
        label_priors, name='label_priors', dtype=labels.dtype.base_dtype)
    return tf.squeeze(label_priors)

  label_priors = util.build_label_priors(
      labels,
      weights,
      variables_collections=variables_collections)
  return label_priors


def true_positives_lower_bound(labels, logits, weights, surrogate_type):
"""Calculate a lower bound on the number of true positives.
This lower bound on the number of true positives given `logits` and `labels`
is the same one used in the global objectives loss functions.
Args:
labels: A `Tensor` of shape [batch_size] or [batch_size, num_labels].
logits: A `Tensor` of shape [batch_size, num_labels] or
[batch_size, num_labels, num_anchors]. If the third dimension is present,
the lower bound is computed on each slice [:, :, k] independently.
weights: Per-example loss coefficients, with shape broadcast-compatible with
that of `labels`.
surrogate_type: Either 'xent' or 'hinge', specifying which upper bound
should be used for indicator functions.
Returns:
A `Tensor` of shape [num_labels] or [num_labels, num_anchors].
"""
  maybe_log2 = tf.log(2.0) if surrogate_type == 'xent' else 1.0
  maybe_log2 = tf.cast(maybe_log2, logits.dtype.base_dtype)
  if logits.get_shape().ndims == 3 and labels.get_shape().ndims < 3:
    labels = tf.expand_dims(labels, 2)
  loss_on_positives = util.weighted_surrogate_loss(
      labels, logits, surrogate_type, negative_weights=0.0) / maybe_log2
  return tf.reduce_sum(weights * (labels - loss_on_positives), 0)


def false_positives_upper_bound(labels, logits, weights, surrogate_type):
"""Calculate an upper bound on the number of false positives.
This upper bound on the number of false positives given `logits` and `labels`
is the same one used in the global objectives loss functions.
Args:
labels: A `Tensor` of shape [batch_size, num_labels]
logits: A `Tensor` of shape [batch_size, num_labels] or
[batch_size, num_labels, num_anchors]. If the third dimension is present,
the lower bound is computed on each slice [:, :, k] independently.
weights: Per-example loss coefficients, with shape broadcast-compatible with
that of `labels`.
surrogate_type: Either 'xent' or 'hinge', specifying which upper bound
should be used for indicator functions.
Returns:
A `Tensor` of shape [num_labels] or [num_labels, num_anchors].
"""
  maybe_log2 = tf.log(2.0) if surrogate_type == 'xent' else 1.0
  maybe_log2 = tf.cast(maybe_log2, logits.dtype.base_dtype)
  loss_on_negatives = util.weighted_surrogate_loss(
      labels, logits, surrogate_type, positive_weights=0.0) / maybe_log2
  return tf.reduce_sum(weights * loss_on_negatives, 0)
research/global_objectives/loss_layers_example.py
0 → 100644
# Copyright 2018 The TensorFlow Global Objectives Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Example for using global objectives.
Illustrate, using synthetic data, how using the precision_at_recall loss
significanly improves the performace of a linear classifier.
"""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
# Dependency imports
import numpy as np
from sklearn.metrics import precision_score
import tensorflow as tf

from global_objectives import loss_layers
# When optimizing using global_objectives, if set to True then the saddle point
# optimization steps are performed internally by the Tensorflow optimizer,
# otherwise by dedicated saddle-point steps as part of the optimization loop.
USE_GO_SADDLE_POINT_OPT = False

TARGET_RECALL = 0.98
TRAIN_ITERATIONS = 150
LEARNING_RATE = 1.0
GO_DUAL_RATE_FACTOR = 15.0
NUM_CHECKPOINTS = 6

EXPERIMENT_DATA_CONFIG = {
    'positives_centers': [[0, 1.0], [1, -0.5]],
    'negatives_centers': [[0, -0.5], [1, 1.0]],
    'positives_variances': [0.15, 0.1],
    'negatives_variances': [0.15, 0.1],
    'positives_counts': [500, 50],
    'negatives_counts': [3000, 100]
}
def create_training_and_eval_data_for_experiment(**data_config):
"""Creates train and eval data sets.
Note: The synthesized binary-labeled data is a mixture of four Gaussians - two
positives and two negatives. The centers, variances, and sizes for each of
the two positives and negatives mixtures are passed in the respective keys
of data_config:
Args:
**data_config: Dictionary with Array entries as follows:
positives_centers - float [2,2] two centers of positives data sets.
negatives_centers - float [2,2] two centers of negatives data sets.
positives_variances - float [2] Variances for the positives sets.
negatives_variances - float [2] Variances for the negatives sets.
positives_counts - int [2] Counts for each of the two positives sets.
negatives_counts - int [2] Counts for each of the two negatives sets.
Returns:
A dictionary with two shuffled data sets created - one for training and one
for eval. The dictionary keys are 'train_data', 'train_labels', 'eval_data',
and 'eval_labels'. The data points are two-dimensional floats, and the
labels are in {0,1}.
"""
  def data_points(is_positives, index):
    variance = data_config['positives_variances'
                           if is_positives else 'negatives_variances'][index]
    center = data_config['positives_centers'
                         if is_positives else 'negatives_centers'][index]
    count = data_config['positives_counts'
                        if is_positives else 'negatives_counts'][index]
    return variance * np.random.randn(count, 2) + np.array([center])

  def create_data():
    return np.concatenate([data_points(False, 0),
                           data_points(True, 0),
                           data_points(True, 1),
                           data_points(False, 1)], axis=0)

  def create_labels():
    """Creates an array of 0.0 or 1.0 labels for the data_config batches."""
    return np.array([0.0] * data_config['negatives_counts'][0] +
                    [1.0] * data_config['positives_counts'][0] +
                    [1.0] * data_config['positives_counts'][1] +
                    [0.0] * data_config['negatives_counts'][1])

  permutation = np.random.permutation(
      sum(data_config['positives_counts'] + data_config['negatives_counts']))

  train_data = create_data()[permutation, :]
  eval_data = create_data()[permutation, :]
  train_labels = create_labels()[permutation]
  eval_labels = create_labels()[permutation]

  return {
      'train_data': train_data,
      'train_labels': train_labels,
      'eval_data': eval_data,
      'eval_labels': eval_labels
  }
def train_model(data, use_global_objectives):
  """Trains a linear model for maximal accuracy or precision at given recall."""

  def precision_at_recall(scores, labels, target_recall):
    """Computes precision - at target recall - over data."""
    positive_scores = scores[labels == 1.0]
    threshold = np.percentile(positive_scores, 100 - target_recall * 100)
    predicted = scores >= threshold
    return precision_score(labels, predicted)
  w = tf.Variable(tf.constant([-1.0, -1.0], shape=[2, 1]), trainable=True,
                  name='weights', dtype=tf.float32)
  b = tf.Variable(tf.zeros([1]), trainable=True, name='biases',
                  dtype=tf.float32)

  logits = tf.matmul(tf.cast(data['train_data'], tf.float32), w) + b

  labels = tf.constant(
      data['train_labels'],
      shape=[len(data['train_labels']), 1],
      dtype=tf.float32)

  if use_global_objectives:
    loss, other_outputs = loss_layers.precision_at_recall_loss(
        labels, logits,
        TARGET_RECALL,
        dual_rate_factor=GO_DUAL_RATE_FACTOR)
    loss = tf.reduce_mean(loss)
  else:
    loss = tf.reduce_mean(
        tf.nn.sigmoid_cross_entropy_with_logits(labels=labels, logits=logits))
  global_step = tf.Variable(0, trainable=False)

  learning_rate = tf.train.polynomial_decay(
      LEARNING_RATE,
      global_step,
      TRAIN_ITERATIONS, (LEARNING_RATE / TRAIN_ITERATIONS),
      power=1.0, cycle=False, name='learning_rate')

  optimizer = tf.train.GradientDescentOptimizer(learning_rate)

  if (not use_global_objectives) or USE_GO_SADDLE_POINT_OPT:
    training_op = optimizer.minimize(loss, global_step=global_step)
  else:
    lambdas = other_outputs['lambdas']
    primal_update_op = optimizer.minimize(loss, var_list=[w, b])
    dual_update_op = optimizer.minimize(
        loss, global_step=global_step, var_list=[lambdas])
  # Training loop:
  with tf.Session() as sess:
    checkpoint_step = TRAIN_ITERATIONS // NUM_CHECKPOINTS
    sess.run(tf.global_variables_initializer())
    step = sess.run(global_step)

    while step <= TRAIN_ITERATIONS:
      if (not use_global_objectives) or USE_GO_SADDLE_POINT_OPT:
        _, step, loss_value, w_value, b_value = sess.run(
            [training_op, global_step, loss, w, b])
      else:
        _, w_value, b_value = sess.run([primal_update_op, w, b])
        _, loss_value, step = sess.run([dual_update_op, loss, global_step])

      if use_global_objectives:
        go_outputs = sess.run(other_outputs.values())

      if step % checkpoint_step == 0:
        precision = precision_at_recall(
            np.dot(data['train_data'], w_value) + b_value,
            data['train_labels'], TARGET_RECALL)

        tf.logging.info('Loss = %f Precision = %f', loss_value, precision)
        if use_global_objectives:
          for i, output_name in enumerate(other_outputs.keys()):
            tf.logging.info('\t%s = %f', output_name, go_outputs[i])

    w_value, b_value = sess.run([w, b])
    return precision_at_recall(np.dot(data['eval_data'], w_value) + b_value,
                               data['eval_labels'],
                               TARGET_RECALL)
def main(unused_argv):
  del unused_argv
  experiment_data = create_training_and_eval_data_for_experiment(
      **EXPERIMENT_DATA_CONFIG)
  global_objectives_loss_precision = train_model(experiment_data, True)
  tf.logging.info('global_objectives precision at requested recall is %f',
                  global_objectives_loss_precision)
  cross_entropy_loss_precision = train_model(experiment_data, False)
  tf.logging.info('cross_entropy precision at requested recall is %f',
                  cross_entropy_loss_precision)


if __name__ == '__main__':
  tf.logging.set_verbosity(tf.logging.INFO)
  tf.app.run()

research/global_objectives/loss_layers_test.py
0 → 100644
# Copyright 2018 The TensorFlow Global Objectives Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Tests for global objectives loss layers."""
# Dependency imports
from absl.testing import parameterized
import numpy
import tensorflow as tf

from global_objectives import loss_layers
from global_objectives import util
# TODO: Include weights in the lagrange multiplier update tests.
class PrecisionRecallAUCLossTest(parameterized.TestCase, tf.test.TestCase):

  @parameterized.named_parameters(
      ('_xent', 'xent', 0.7),
      ('_hinge', 'hinge', 0.7),
      ('_hinge_2', 'hinge', 0.5)
  )
  def testSinglePointAUC(self, surrogate_type, target_precision):
    # Tests a case with only one anchor point, where the loss should equal
    # recall_at_precision_loss
    batch_shape = [10, 2]
    logits = tf.Variable(tf.random_normal(batch_shape))
    labels = tf.Variable(
        tf.to_float(tf.greater(tf.random_uniform(batch_shape), 0.4)))

    auc_loss, _ = loss_layers.precision_recall_auc_loss(
        labels,
        logits,
        precision_range=(target_precision - 0.01, target_precision + 0.01),
        num_anchors=1,
        surrogate_type=surrogate_type)
    point_loss, _ = loss_layers.recall_at_precision_loss(
        labels, logits, target_precision=target_precision,
        surrogate_type=surrogate_type)

    with self.test_session():
      tf.global_variables_initializer().run()
      self.assertAllClose(auc_loss.eval(), point_loss.eval())
  def testThreePointAUC(self):
    # Tests a case with three anchor points against a weighted sum of recall
    # at precision losses.
    batch_shape = [11, 3]
    logits = tf.Variable(tf.random_normal(batch_shape))
    labels = tf.Variable(
        tf.to_float(tf.greater(tf.random_uniform(batch_shape), 0.4)))

    # TODO: Place the hing/xent loss in a for loop.
    auc_loss, _ = loss_layers.precision_recall_auc_loss(
        labels, logits, num_anchors=1)
    first_point_loss, _ = loss_layers.recall_at_precision_loss(
        labels, logits, target_precision=0.25)
    second_point_loss, _ = loss_layers.recall_at_precision_loss(
        labels, logits, target_precision=0.5)
    third_point_loss, _ = loss_layers.recall_at_precision_loss(
        labels, logits, target_precision=0.75)
    expected_loss = (first_point_loss + second_point_loss +
                     third_point_loss) / 3

    auc_loss_hinge, _ = loss_layers.precision_recall_auc_loss(
        labels, logits, num_anchors=1, surrogate_type='hinge')
    first_point_hinge, _ = loss_layers.recall_at_precision_loss(
        labels, logits, target_precision=0.25, surrogate_type='hinge')
    second_point_hinge, _ = loss_layers.recall_at_precision_loss(
        labels, logits, target_precision=0.5, surrogate_type='hinge')
    third_point_hinge, _ = loss_layers.recall_at_precision_loss(
        labels, logits, target_precision=0.75, surrogate_type='hinge')
    expected_hinge = (first_point_hinge + second_point_hinge +
                      third_point_hinge) / 3

    with self.test_session():
      tf.global_variables_initializer().run()
      self.assertAllClose(auc_loss.eval(), expected_loss.eval())
      self.assertAllClose(auc_loss_hinge.eval(), expected_hinge.eval())

  def testLagrangeMultiplierUpdateDirection(self):
    for target_precision in [0.35, 0.65]:
      precision_range = (target_precision - 0.01, target_precision + 0.01)

      for surrogate_type in ['xent', 'hinge']:
        kwargs = {'precision_range': precision_range,
                  'num_anchors': 1,
                  'surrogate_type': surrogate_type,
                  'scope': 'pr-auc_{}_{}'.format(target_precision,
                                                 surrogate_type)}
        run_lagrange_multiplier_test(
            global_objective=loss_layers.precision_recall_auc_loss,
            objective_kwargs=kwargs,
            data_builder=_multilabel_data,
            test_object=self)

        kwargs['scope'] = 'other-' + kwargs['scope']
        run_lagrange_multiplier_test(
            global_objective=loss_layers.precision_recall_auc_loss,
            objective_kwargs=kwargs,
            data_builder=_other_multilabel_data(surrogate_type),
            test_object=self)


class ROCAUCLossTest(parameterized.TestCase, tf.test.TestCase):

  def testSimpleScores(self):
    # Tests the loss on data with only one negative example with score zero.
    # In this case, the loss should equal the surrogate loss on the scores
    # with positive labels.
    num_positives = 10
    scores_positives = tf.constant(3.0 * numpy.random.randn(num_positives),
                                   shape=[num_positives, 1])
    labels = tf.constant([0.0] + [1.0] * num_positives,
                         shape=[num_positives + 1, 1])
    scores = tf.concat([[[0.0]], scores_positives], 0)

    loss = tf.reduce_sum(
        loss_layers.roc_auc_loss(labels, scores, surrogate_type='hinge')[0])
    expected_loss = tf.reduce_sum(
        tf.maximum(1.0 - scores_positives, 0)) / (num_positives + 1)

    with self.test_session():
      self.assertAllClose(expected_loss.eval(), loss.eval())

  def testRandomROCLoss(self):
    # Checks that random Bernoulli scores and labels give ~25% swaps.
    shape = [1000, 30]
    scores = tf.constant(numpy.random.randint(0, 2, size=shape),
                         shape=shape, dtype=tf.float32)
    labels = tf.constant(numpy.random.randint(0, 2, size=shape),
                         shape=shape, dtype=tf.float32)
    loss = tf.reduce_mean(
        loss_layers.roc_auc_loss(labels, scores, surrogate_type='hinge')[0])
    with self.test_session():
      self.assertAllClose(0.25, loss.eval(), 1e-2)

  @parameterized.named_parameters(
      ('_zero_hinge', 'xent',
       [0.0, 0.0, 0.0, 1.0, 1.0, 1.0],
       [-5.0, -7.0, -9.0, 8.0, 10.0, 14.0],
       0.0),
      ('_zero_xent', 'hinge',
       [0.0, 0.0, 0.0, 1.0, 1.0, 1.0],
       [-0.2, 0, -0.1, 1.0, 1.1, 1.0],
       0.0),
      ('_xent', 'xent',
       [0.0, 0.0, 0.0, 1.0, 1.0, 1.0],
       [0.0, -17.0, -19.0, 1.0, 14.0, 14.0],
       numpy.log(1.0 + numpy.exp(-1.0)) / 6),
      ('_hinge', 'hinge',
       [0.0, 0.0, 0.0, 1.0, 1.0, 1.0],
       [-0.2, -0.05, 0.0, 0.95, 0.8, 1.0],
       0.4 / 6)
  )
  def testManualROCLoss(self, surrogate_type, labels, logits, expected_value):
    labels = tf.constant(labels)
    logits = tf.constant(logits)
    loss, _ = loss_layers.roc_auc_loss(
        labels=labels, logits=logits, surrogate_type=surrogate_type)

    with self.test_session():
      self.assertAllClose(expected_value, tf.reduce_sum(loss).eval())

  def testMultiLabelROCLoss(self):
    # Tests the loss on multi-label data against manually computed loss.
    targets = numpy.array([[0.0, 0.0, 1.0, 1.0], [0.0, 0.0, 1.0, 1.0]])
    scores = numpy.array([[0.1, 1.0, 1.1, 1.0], [1.0, 0.0, 1.3, 1.1]])
    class_1_auc = tf.reduce_sum(
        loss_layers.roc_auc_loss(targets[0], scores[0])[0])
    class_2_auc = tf.reduce_sum(
        loss_layers.roc_auc_loss(targets[1], scores[1])[0])
    total_auc = tf.reduce_sum(
        loss_layers.roc_auc_loss(targets.transpose(), scores.transpose())[0])

    with self.test_session():
      self.assertAllClose(total_auc.eval(),
                          class_1_auc.eval() + class_2_auc.eval())

  def testWeights(self):
    # Test the loss with per-example weights.
    # The logits_negatives below are repeated, so that setting half their
    # weights to 2 and the other half to 0 should leave the loss unchanged.
    logits_positives = tf.constant([2.54321, -0.26, 3.334334], shape=[3, 1])
    logits_negatives = tf.constant([-0.6, 1, -1.3, -1.3, -0.6, 1],
                                   shape=[6, 1])
    logits = tf.concat([logits_positives, logits_negatives], 0)
    targets = tf.constant([1, 1, 1, 0, 0, 0, 0, 0, 0],
                          shape=[9, 1], dtype=tf.float32)
    weights = tf.constant([1, 1, 1, 0, 0, 0, 2, 2, 2],
                          shape=[9, 1], dtype=tf.float32)

    loss = tf.reduce_sum(loss_layers.roc_auc_loss(targets, logits)[0])
    weighted_loss = tf.reduce_sum(
        loss_layers.roc_auc_loss(targets, logits, weights)[0])

    with self.test_session():
      self.assertAllClose(loss.eval(), weighted_loss.eval())


class RecallAtPrecisionTest(tf.test.TestCase):

  def testEqualWeightLoss(self):
    # Tests a special case where the loss should equal cross entropy loss.
    target_precision = 1.0
    num_labels = 5
    batch_shape = [20, num_labels]
    logits = tf.Variable(tf.random_normal(batch_shape))
    targets = tf.Variable(
        tf.to_float(tf.greater(tf.random_uniform(batch_shape), 0.7)))
    label_priors = tf.constant(0.34, shape=[num_labels])

    loss, _ = loss_layers.recall_at_precision_loss(
        targets, logits, target_precision, label_priors=label_priors)
    expected_loss = (
        tf.contrib.nn.deprecated_flipped_sigmoid_cross_entropy_with_logits(
            logits, targets))

    with self.test_session() as session:
      tf.global_variables_initializer().run()
      loss_val, expected_val = session.run([loss, expected_loss])
      self.assertAllClose(loss_val, expected_val)

  def testEqualWeightLossWithMultiplePrecisions(self):
    """Tests a case where the loss equals xent loss with multiple precisions."""
    target_precision = [1.0, 1.0]
    num_labels = 2
    batch_size = 20
    target_shape = [batch_size, num_labels]
    logits = tf.Variable(tf.random_normal(target_shape))
    targets = tf.Variable(
        tf.to_float(tf.greater(tf.random_uniform(target_shape), 0.7)))
    label_priors = tf.constant([0.34], shape=[num_labels])

    loss, _ = loss_layers.recall_at_precision_loss(
        targets,
        logits,
        target_precision,
        label_priors=label_priors,
        surrogate_type='xent',
    )
    expected_loss = (
        tf.contrib.nn.deprecated_flipped_sigmoid_cross_entropy_with_logits(
            logits, targets))

    with self.test_session() as session:
      tf.global_variables_initializer().run()
      loss_val, expected_val = session.run([loss, expected_loss])
      self.assertAllClose(loss_val, expected_val)

  def testPositivesOnlyLoss(self):
    # Tests a special case where the loss should equal cross entropy loss
    # on the positives only.
    target_precision = 1.0
    num_labels = 3
    batch_shape = [30, num_labels]
    logits = tf.Variable(tf.random_normal(batch_shape))
    targets = tf.Variable(
        tf.to_float(tf.greater(tf.random_uniform(batch_shape), 0.4)))
    label_priors = tf.constant(0.45, shape=[num_labels])

    loss, _ = loss_layers.recall_at_precision_loss(
        targets, logits, target_precision, label_priors=label_priors,
        lambdas_initializer=tf.zeros_initializer())
    expected_loss = util.weighted_sigmoid_cross_entropy_with_logits(
        targets, logits, positive_weights=1.0, negative_weights=0.0)

    with self.test_session() as session:
      tf.global_variables_initializer().run()
      loss_val, expected_val = session.run([loss, expected_loss])
      self.assertAllClose(loss_val, expected_val)

  def testEquivalenceBetweenSingleAndMultiplePrecisions(self):
    """Checks recall at precision with different precision values.

    Runs recall at precision with multiple precision values, and runs each
    label separately with its own precision value as a scalar. Validates that
    the returned loss values are the same.
    """
    target_precision = [0.2, 0.9, 0.4]
    num_labels = 3
    batch_shape = [30, num_labels]
    logits = tf.Variable(tf.random_normal(batch_shape))
    targets = tf.Variable(
        tf.to_float(tf.greater(tf.random_uniform(batch_shape), 0.4)))
    label_priors = tf.constant([0.45, 0.8, 0.3], shape=[num_labels])

    multi_label_loss, _ = loss_layers.recall_at_precision_loss(
        targets, logits, target_precision, label_priors=label_priors,
    )

    single_label_losses = [
        loss_layers.recall_at_precision_loss(
            tf.expand_dims(targets[:, i], -1),
            tf.expand_dims(logits[:, i], -1),
            target_precision[i],
            label_priors=label_priors[i])[0]
        for i in range(num_labels)
    ]

    single_label_losses = tf.concat(single_label_losses, 1)

    with self.test_session() as session:
      tf.global_variables_initializer().run()
      multi_label_loss_val, single_label_loss_val = session.run(
          [multi_label_loss, single_label_losses])
      self.assertAllClose(multi_label_loss_val, single_label_loss_val)

  def testEquivalenceBetweenSingleAndEqualMultiplePrecisions(self):
    """Compares single and multiple target precisions with the same value.

    Checks that using a single target precision and multiple target precisions
    with the same value would result in the same loss value.
    """
    num_labels = 2
    target_shape = [20, num_labels]
    logits = tf.Variable(tf.random_normal(target_shape))
    targets = tf.Variable(
        tf.to_float(tf.greater(tf.random_uniform(target_shape), 0.7)))
    label_priors = tf.constant([0.34], shape=[num_labels])

    multi_precision_loss, _ = loss_layers.recall_at_precision_loss(
        targets,
        logits,
        [0.75, 0.75],
        label_priors=label_priors,
        surrogate_type='xent',
    )

    single_precision_loss, _ = loss_layers.recall_at_precision_loss(
        targets,
        logits,
        0.75,
        label_priors=label_priors,
        surrogate_type='xent',
    )

    with self.test_session() as session:
      tf.global_variables_initializer().run()
      multi_precision_loss_val, single_precision_loss_val = session.run(
          [multi_precision_loss, single_precision_loss])
      self.assertAllClose(multi_precision_loss_val, single_precision_loss_val)

  def testLagrangeMultiplierUpdateDirection(self):
    for target_precision in [0.35, 0.65]:
      for surrogate_type in ['xent', 'hinge']:
        kwargs = {'target_precision': target_precision,
                  'surrogate_type': surrogate_type,
                  'scope': 'r-at-p_{}_{}'.format(target_precision,
                                                 surrogate_type)}
        run_lagrange_multiplier_test(
            global_objective=loss_layers.recall_at_precision_loss,
            objective_kwargs=kwargs,
            data_builder=_multilabel_data,
            test_object=self)

        kwargs['scope'] = 'other-' + kwargs['scope']
        run_lagrange_multiplier_test(
            global_objective=loss_layers.recall_at_precision_loss,
            objective_kwargs=kwargs,
            data_builder=_other_multilabel_data(surrogate_type),
            test_object=self)

  def testLagrangeMultiplierUpdateDirectionWithMultiplePrecisions(self):
    """Runs Lagrange multiplier test with multiple precision values."""
    target_precision = [0.65, 0.35]

    for surrogate_type in ['xent', 'hinge']:
      scope_str = 'r-at-p_{}_{}'.format(
          '_'.join([str(precision) for precision in target_precision]),
          surrogate_type)
      kwargs = {'target_precision': target_precision,
                'surrogate_type': surrogate_type,
                'scope': scope_str,
                }
      run_lagrange_multiplier_test(
          global_objective=loss_layers.recall_at_precision_loss,
          objective_kwargs=kwargs,
          data_builder=_multilabel_data,
          test_object=self)

      kwargs['scope'] = 'other-' + kwargs['scope']
      run_lagrange_multiplier_test(
          global_objective=loss_layers.recall_at_precision_loss,
          objective_kwargs=kwargs,
          data_builder=_other_multilabel_data(surrogate_type),
          test_object=self)


class PrecisionAtRecallTest(tf.test.TestCase):

  def testCrossEntropyEquivalence(self):
    # Checks a special case where the loss should equal cross-entropy loss.
    target_recall = 1.0
    num_labels = 3
    batch_shape = [10, num_labels]
    logits = tf.Variable(tf.random_normal(batch_shape))
    targets = tf.Variable(
        tf.to_float(tf.greater(tf.random_uniform(batch_shape), 0.4)))

    loss, _ = loss_layers.precision_at_recall_loss(
        targets, logits, target_recall,
        lambdas_initializer=tf.constant_initializer(1.0))
    expected_loss = util.weighted_sigmoid_cross_entropy_with_logits(
        targets, logits)

    with self.test_session():
      tf.global_variables_initializer().run()
      self.assertAllClose(loss.eval(), expected_loss.eval())

  def testNegativesOnlyLoss(self):
    # Checks a special case where the loss should equal the loss on
    # the negative examples only.
    target_recall = 0.61828
    num_labels = 4
    batch_shape = [8, num_labels]
    logits = tf.Variable(tf.random_normal(batch_shape))
    targets = tf.Variable(
        tf.to_float(tf.greater(tf.random_uniform(batch_shape), 0.6)))

    loss, _ = loss_layers.precision_at_recall_loss(
        targets,
        logits,
        target_recall,
        surrogate_type='hinge',
        lambdas_initializer=tf.constant_initializer(0.0),
        scope='negatives_only_test')
    expected_loss = util.weighted_hinge_loss(
        targets, logits, positive_weights=0.0, negative_weights=1.0)

    with self.test_session():
      tf.global_variables_initializer().run()
      self.assertAllClose(expected_loss.eval(), loss.eval())

  def testLagrangeMultiplierUpdateDirection(self):
    for target_recall in [0.34, 0.66]:
      for surrogate_type in ['xent', 'hinge']:
        kwargs = {'target_recall': target_recall,
                  'dual_rate_factor': 1.0,
                  'surrogate_type': surrogate_type,
                  'scope': 'p-at-r_{}_{}'.format(target_recall,
                                                 surrogate_type)}
        run_lagrange_multiplier_test(
            global_objective=loss_layers.precision_at_recall_loss,
            objective_kwargs=kwargs,
            data_builder=_multilabel_data,
            test_object=self)

        kwargs['scope'] = 'other-' + kwargs['scope']
        run_lagrange_multiplier_test(
            global_objective=loss_layers.precision_at_recall_loss,
            objective_kwargs=kwargs,
            data_builder=_other_multilabel_data(surrogate_type),
            test_object=self)

  def testCrossEntropyEquivalenceWithMultipleRecalls(self):
    """Checks a case where the loss equals xent loss with multiple recalls."""
    num_labels = 3
    target_recall = [1.0] * num_labels
    batch_shape = [10, num_labels]
    logits = tf.Variable(tf.random_normal(batch_shape))
    targets = tf.Variable(
        tf.to_float(tf.greater(tf.random_uniform(batch_shape), 0.4)))

    loss, _ = loss_layers.precision_at_recall_loss(
        targets, logits, target_recall,
        lambdas_initializer=tf.constant_initializer(1.0))
    expected_loss = util.weighted_sigmoid_cross_entropy_with_logits(
        targets, logits)

    with self.test_session():
      tf.global_variables_initializer().run()
      self.assertAllClose(loss.eval(), expected_loss.eval())

  def testNegativesOnlyLossWithMultipleRecalls(self):
    """Tests a case where the loss equals the loss on the negative examples.

    Checks this special case using multiple target recall values.
    """
    num_labels = 4
    target_recall = [0.61828] * num_labels
    batch_shape = [8, num_labels]
    logits = tf.Variable(tf.random_normal(batch_shape))
    targets = tf.Variable(
        tf.to_float(tf.greater(tf.random_uniform(batch_shape), 0.6)))

    loss, _ = loss_layers.precision_at_recall_loss(
        targets,
        logits,
        target_recall,
        surrogate_type='hinge',
        lambdas_initializer=tf.constant_initializer(0.0),
        scope='negatives_only_test')
    expected_loss = util.weighted_hinge_loss(
        targets, logits, positive_weights=0.0, negative_weights=1.0)

    with self.test_session():
      tf.global_variables_initializer().run()
      self.assertAllClose(expected_loss.eval(), loss.eval())

  def testLagrangeMultiplierUpdateDirectionWithMultipleRecalls(self):
    """Runs Lagrange multiplier test with multiple recall values."""
    target_recall = [0.34, 0.66]

    for surrogate_type in ['xent', 'hinge']:
      scope_str = 'p-at-r_{}_{}'.format(
          '_'.join([str(recall) for recall in target_recall]),
          surrogate_type)
      kwargs = {'target_recall': target_recall,
                'dual_rate_factor': 1.0,
                'surrogate_type': surrogate_type,
                'scope': scope_str}
      run_lagrange_multiplier_test(
          global_objective=loss_layers.precision_at_recall_loss,
          objective_kwargs=kwargs,
          data_builder=_multilabel_data,
          test_object=self)

      kwargs['scope'] = 'other-' + kwargs['scope']
      run_lagrange_multiplier_test(
          global_objective=loss_layers.precision_at_recall_loss,
          objective_kwargs=kwargs,
          data_builder=_other_multilabel_data(surrogate_type),
          test_object=self)

  def testEquivalenceBetweenSingleAndMultipleRecalls(self):
    """Checks precision at recall with multiple different recall values.

    Runs precision at recall with multiple recall values, and runs each label
    separately with its own recall value as a scalar. Validates that the
    returned loss values are the same.
    """
    target_precision = [0.7, 0.9, 0.4]
    num_labels = 3
    batch_shape = [30, num_labels]
    logits = tf.Variable(tf.random_normal(batch_shape))
    targets = tf.Variable(
        tf.to_float(tf.greater(tf.random_uniform(batch_shape), 0.4)))
    label_priors = tf.constant(0.45, shape=[num_labels])

    multi_label_loss, _ = loss_layers.precision_at_recall_loss(
        targets, logits, target_precision, label_priors=label_priors)

    single_label_losses = [
        loss_layers.precision_at_recall_loss(
            tf.expand_dims(targets[:, i], -1),
            tf.expand_dims(logits[:, i], -1),
            target_precision[i],
            label_priors=label_priors[i])[0]
        for i in range(num_labels)
    ]

    single_label_losses = tf.concat(single_label_losses, 1)

    with self.test_session() as session:
      tf.global_variables_initializer().run()
      multi_label_loss_val, single_label_loss_val = session.run(
          [multi_label_loss, single_label_losses])
      self.assertAllClose(multi_label_loss_val, single_label_loss_val)

  def testEquivalenceBetweenSingleAndEqualMultipleRecalls(self):
    """Compares single and multiple target recalls of the same value.

    Checks that using a single target recall and multiple recalls with the
    same value would result in the same loss value.
    """
    num_labels = 2
    target_shape = [20, num_labels]
    logits = tf.Variable(tf.random_normal(target_shape))
    targets = tf.Variable(
        tf.to_float(tf.greater(tf.random_uniform(target_shape), 0.7)))
    label_priors = tf.constant([0.34], shape=[num_labels])

    multi_precision_loss, _ = loss_layers.precision_at_recall_loss(
        targets,
        logits,
        [0.75, 0.75],
        label_priors=label_priors,
        surrogate_type='xent',
    )

    single_precision_loss, _ = loss_layers.precision_at_recall_loss(
        targets,
        logits,
        0.75,
        label_priors=label_priors,
        surrogate_type='xent',
    )

    with self.test_session() as session:
      tf.global_variables_initializer().run()
      multi_precision_loss_val, single_precision_loss_val = session.run(
          [multi_precision_loss, single_precision_loss])
      self.assertAllClose(multi_precision_loss_val, single_precision_loss_val)


class FalsePositiveRateAtTruePositiveRateTest(tf.test.TestCase):

  def testNegativesOnlyLoss(self):
    # Checks a special case where the loss returned should be the loss on the
    # negative examples.
    target_recall = 0.6
    num_labels = 3
    batch_shape = [3, num_labels]
    logits = tf.Variable(tf.random_normal(batch_shape))
    targets = tf.Variable(
        tf.to_float(tf.greater(tf.random_uniform(batch_shape), 0.4)))
    label_priors = tf.constant(numpy.random.uniform(size=[num_labels]),
                               dtype=tf.float32)

    xent_loss, _ = loss_layers.false_positive_rate_at_true_positive_rate_loss(
        targets, logits, target_recall, label_priors=label_priors,
        lambdas_initializer=tf.constant_initializer(0.0))
    xent_expected = util.weighted_sigmoid_cross_entropy_with_logits(
        targets,
        logits,
        positive_weights=0.0,
        negative_weights=1.0)
    hinge_loss, _ = loss_layers.false_positive_rate_at_true_positive_rate_loss(
        targets, logits, target_recall, label_priors=label_priors,
        lambdas_initializer=tf.constant_initializer(0.0),
        surrogate_type='hinge')
    hinge_expected = util.weighted_hinge_loss(
        targets, logits, positive_weights=0.0, negative_weights=1.0)

    with self.test_session() as session:
      tf.global_variables_initializer().run()
      xent_val, xent_expected = session.run([xent_loss, xent_expected])
      self.assertAllClose(xent_val, xent_expected)
      hinge_val, hinge_expected = session.run([hinge_loss, hinge_expected])
      self.assertAllClose(hinge_val, hinge_expected)

  def testPositivesOnlyLoss(self):
    # Checks a special case where the loss returned should be the loss on the
    # positive examples only.
    target_recall = 1.0
    num_labels = 5
    batch_shape = [5, num_labels]
    logits = tf.Variable(tf.random_normal(batch_shape))
    targets = tf.ones_like(logits)
    label_priors = tf.constant(numpy.random.uniform(size=[num_labels]),
                               dtype=tf.float32)

    loss, _ = loss_layers.false_positive_rate_at_true_positive_rate_loss(
        targets, logits, target_recall, label_priors=label_priors)
    expected_loss = tf.nn.sigmoid_cross_entropy_with_logits(
        labels=targets, logits=logits)
    hinge_loss, _ = loss_layers.false_positive_rate_at_true_positive_rate_loss(
        targets, logits, target_recall, label_priors=label_priors,
        surrogate_type='hinge')
    expected_hinge = util.weighted_hinge_loss(targets, logits)

    with self.test_session():
      tf.global_variables_initializer().run()
      self.assertAllClose(loss.eval(), expected_loss.eval())
      self.assertAllClose(hinge_loss.eval(), expected_hinge.eval())

  def testEqualWeightLoss(self):
    # Checks a special case where the loss returned should be proportional to
    # the ordinary loss.
    target_recall = 1.0
    num_labels = 4
    batch_shape = [40, num_labels]
    logits = tf.Variable(tf.random_normal(batch_shape))
    targets = tf.Variable(
        tf.to_float(tf.greater(tf.random_uniform(batch_shape), 0.6)))
    label_priors = tf.constant(0.5, shape=[num_labels])

    loss, _ = loss_layers.false_positive_rate_at_true_positive_rate_loss(
        targets, logits, target_recall, label_priors=label_priors)
    expected_loss = tf.nn.sigmoid_cross_entropy_with_logits(
        labels=targets, logits=logits)

    with self.test_session():
      tf.global_variables_initializer().run()
      self.assertAllClose(loss.eval(), expected_loss.eval())

  def testLagrangeMultiplierUpdateDirection(self):
    for target_rate in [0.35, 0.65]:
      for surrogate_type in ['xent', 'hinge']:
        kwargs = {'target_rate': target_rate,
                  'surrogate_type': surrogate_type,
                  'scope': 'fpr-at-tpr_{}_{}'.format(target_rate,
                                                     surrogate_type)}
        # True positive rate is a synonym for recall, so we use the
        # recall constraint data.
        run_lagrange_multiplier_test(
            global_objective=(
                loss_layers.false_positive_rate_at_true_positive_rate_loss),
            objective_kwargs=kwargs,
            data_builder=_multilabel_data,
            test_object=self)

        kwargs['scope'] = 'other-' + kwargs['scope']
        run_lagrange_multiplier_test(
            global_objective=(
                loss_layers.false_positive_rate_at_true_positive_rate_loss),
            objective_kwargs=kwargs,
            data_builder=_other_multilabel_data(surrogate_type),
            test_object=self)

  def testLagrangeMultiplierUpdateDirectionWithMultipleRates(self):
    """Runs Lagrange multiplier test with multiple target rates."""
    target_rate = [0.35, 0.65]
    for surrogate_type in ['xent', 'hinge']:
      kwargs = {'target_rate': target_rate,
                'surrogate_type': surrogate_type,
                'scope': 'fpr-at-tpr_{}_{}'.format(
                    '_'.join([str(target) for target in target_rate]),
                    surrogate_type)}
      # True positive rate is a synonym for recall, so we use the
      # recall constraint data.
      run_lagrange_multiplier_test(
          global_objective=(
              loss_layers.false_positive_rate_at_true_positive_rate_loss),
          objective_kwargs=kwargs,
          data_builder=_multilabel_data,
          test_object=self)

      kwargs['scope'] = 'other-' + kwargs['scope']
      run_lagrange_multiplier_test(
          global_objective=(
              loss_layers.false_positive_rate_at_true_positive_rate_loss),
          objective_kwargs=kwargs,
          data_builder=_other_multilabel_data(surrogate_type),
          test_object=self)

  def testEquivalenceBetweenSingleAndEqualMultipleRates(self):
    """Compares single and multiple target rates of the same value.

    Checks that using a single target rate and multiple rates with the
    same value would result in the same loss value.
    """
    num_labels = 2
    target_shape = [20, num_labels]
    logits = tf.Variable(tf.random_normal(target_shape))
    targets = tf.Variable(
        tf.to_float(tf.greater(tf.random_uniform(target_shape), 0.7)))
    label_priors = tf.constant([0.34], shape=[num_labels])

    multi_label_loss, _ = (
        loss_layers.false_positive_rate_at_true_positive_rate_loss(
            targets, logits, [0.75, 0.75], label_priors=label_priors))

    single_label_loss, _ = (
        loss_layers.false_positive_rate_at_true_positive_rate_loss(
            targets, logits, 0.75, label_priors=label_priors))

    with self.test_session() as session:
      tf.global_variables_initializer().run()
      multi_label_loss_val, single_label_loss_val = session.run(
          [multi_label_loss, single_label_loss])
      self.assertAllClose(multi_label_loss_val, single_label_loss_val)

  def testEquivalenceBetweenSingleAndMultipleRates(self):
    """Compares single and multiple target rates of different values.

    Runs false_positive_rate_at_true_positive_rate_loss with multiple target
    rates, and runs each label separately with its own target rate as a
    scalar. Validates that the returned loss values are the same.
    """
    target_precision = [0.7, 0.9, 0.4]
    num_labels = 3
    batch_shape = [30, num_labels]
    logits = tf.Variable(tf.random_normal(batch_shape))
    targets = tf.Variable(
        tf.to_float(tf.greater(tf.random_uniform(batch_shape), 0.4)))
    label_priors = tf.constant(0.45, shape=[num_labels])

    multi_label_loss, _ = (
        loss_layers.false_positive_rate_at_true_positive_rate_loss(
            targets, logits, target_precision, label_priors=label_priors))

    single_label_losses = [
        loss_layers.false_positive_rate_at_true_positive_rate_loss(
            tf.expand_dims(targets[:, i], -1),
            tf.expand_dims(logits[:, i], -1),
            target_precision[i],
            label_priors=label_priors[i])[0]
        for i in range(num_labels)
    ]

    single_label_losses = tf.concat(single_label_losses, 1)

    with self.test_session() as session:
      tf.global_variables_initializer().run()
      multi_label_loss_val, single_label_loss_val = session.run(
          [multi_label_loss, single_label_losses])
      self.assertAllClose(multi_label_loss_val, single_label_loss_val)


class TruePositiveRateAtFalsePositiveRateTest(tf.test.TestCase):

  def testPositivesOnlyLoss(self):
    # A special case where the loss should equal the loss on the positive
    # examples.
    target_rate = numpy.random.uniform()
    num_labels = 3
    batch_shape = [20, num_labels]
    logits = tf.Variable(tf.random_normal(batch_shape))
    targets = tf.Variable(
        tf.to_float(tf.greater(tf.random_uniform(batch_shape), 0.6)))
    label_priors = tf.constant(numpy.random.uniform(size=[num_labels]),
                               dtype=tf.float32)

    xent_loss, _ = loss_layers.true_positive_rate_at_false_positive_rate_loss(
        targets, logits, target_rate, label_priors=label_priors,
        lambdas_initializer=tf.constant_initializer(0.0))
    xent_expected = util.weighted_sigmoid_cross_entropy_with_logits(
        targets, logits, positive_weights=1.0, negative_weights=0.0)
    hinge_loss, _ = loss_layers.true_positive_rate_at_false_positive_rate_loss(
        targets, logits, target_rate, label_priors=label_priors,
        lambdas_initializer=tf.constant_initializer(0.0),
        surrogate_type='hinge')
    hinge_expected = util.weighted_hinge_loss(
        targets, logits, positive_weights=1.0, negative_weights=0.0)

    with self.test_session():
      tf.global_variables_initializer().run()
      self.assertAllClose(xent_expected.eval(), xent_loss.eval())
      self.assertAllClose(hinge_expected.eval(), hinge_loss.eval())

  def testNegativesOnlyLoss(self):
    # A special case where the loss should equal the loss on the negative
    # examples, minus target_rate * (1 - label_priors) * maybe_log2.
    target_rate = numpy.random.uniform()
    num_labels = 3
    batch_shape = [25, num_labels]
    logits = tf.Variable(tf.random_normal(batch_shape))
    targets = tf.zeros_like(logits)
    label_priors = tf.constant(numpy.random.uniform(size=[num_labels]),
                               dtype=tf.float32)

    xent_loss, _ = loss_layers.true_positive_rate_at_false_positive_rate_loss(
        targets, logits, target_rate, label_priors=label_priors)
    xent_expected = tf.subtract(
        util.weighted_sigmoid_cross_entropy_with_logits(
            targets, logits, positive_weights=0.0, negative_weights=1.0),
        target_rate * (1.0 - label_priors) * numpy.log(2))
    hinge_loss, _ = loss_layers.true_positive_rate_at_false_positive_rate_loss(
        targets, logits, target_rate, label_priors=label_priors,
        surrogate_type='hinge')
    hinge_expected = util.weighted_hinge_loss(
        targets, logits) - target_rate * (1.0 - label_priors)

    with self.test_session():
      tf.global_variables_initializer().run()
      self.assertAllClose(xent_expected.eval(), xent_loss.eval())
      self.assertAllClose(hinge_expected.eval(), hinge_loss.eval())

  def testLagrangeMultiplierUpdateDirection(self):
    for target_rate in [0.35, 0.65]:
      for surrogate_type in ['xent', 'hinge']:
        kwargs = {'target_rate': target_rate,
                  'surrogate_type': surrogate_type,
                  'scope': 'tpr-at-fpr_{}_{}'.format(target_rate,
                                                     surrogate_type)}
        run_lagrange_multiplier_test(
            global_objective=(
                loss_layers.true_positive_rate_at_false_positive_rate_loss),
            objective_kwargs=kwargs,
            data_builder=_multilabel_data,
            test_object=self)

        kwargs['scope'] = 'other-' + kwargs['scope']
        run_lagrange_multiplier_test(
            global_objective=(
                loss_layers.true_positive_rate_at_false_positive_rate_loss),
            objective_kwargs=kwargs,
            data_builder=_other_multilabel_data(surrogate_type),
            test_object=self)

  def testLagrangeMultiplierUpdateDirectionWithMultipleRates(self):
    """Runs Lagrange multiplier test with multiple target rates."""
    target_rate = [0.35, 0.65]
    for surrogate_type in ['xent', 'hinge']:
      kwargs = {'target_rate': target_rate,
                'surrogate_type': surrogate_type,
                'scope': 'tpr-at-fpr_{}_{}'.format(
                    '_'.join([str(target) for target in target_rate]),
                    surrogate_type)}
      run_lagrange_multiplier_test(
          global_objective=(
              loss_layers.true_positive_rate_at_false_positive_rate_loss),
          objective_kwargs=kwargs,
          data_builder=_multilabel_data,
          test_object=self)

      kwargs['scope'] = 'other-' + kwargs['scope']
      run_lagrange_multiplier_test(
          global_objective=(
              loss_layers.true_positive_rate_at_false_positive_rate_loss),
          objective_kwargs=kwargs,
          data_builder=_other_multilabel_data(surrogate_type),
          test_object=self)

  def testEquivalenceBetweenSingleAndEqualMultipleRates(self):
    """Compares single and multiple target rates of the same value.

    Checks that using a single target rate and multiple rates with the
    same value would result in the same loss value.
    """
    num_labels = 2
    target_shape = [20, num_labels]
    logits = tf.Variable(tf.random_normal(target_shape))
    targets = tf.Variable(
        tf.to_float(tf.greater(tf.random_uniform(target_shape), 0.7)))
    label_priors = tf.constant([0.34], shape=[num_labels])

    multi_label_loss, _ = (
        loss_layers.true_positive_rate_at_false_positive_rate_loss(
            targets, logits, [0.75, 0.75], label_priors=label_priors))

    single_label_loss, _ = (
        loss_layers.true_positive_rate_at_false_positive_rate_loss(
            targets, logits, 0.75, label_priors=label_priors))

    with self.test_session() as session:
      tf.global_variables_initializer().run()
      multi_label_loss_val, single_label_loss_val = session.run(
          [multi_label_loss, single_label_loss])
      self.assertAllClose(multi_label_loss_val, single_label_loss_val)

  def testEquivalenceBetweenSingleAndMultipleRates(self):
    """Compares single and multiple target rates of different values.

    Runs true_positive_rate_at_false_positive_rate_loss with multiple target
    rates, and runs each label separately with its own target rate as a
    scalar. Validates that the returned loss values are the same.
    """
    target_precision = [0.7, 0.9, 0.4]
    num_labels = 3
    batch_shape = [30, num_labels]
    logits = tf.Variable(tf.random_normal(batch_shape))
    targets = tf.Variable(
        tf.to_float(tf.greater(tf.random_uniform(batch_shape), 0.4)))
    label_priors = tf.constant(0.45, shape=[num_labels])

    multi_label_loss, _ = (
        loss_layers.true_positive_rate_at_false_positive_rate_loss(
            targets, logits, target_precision, label_priors=label_priors))

    single_label_losses = [
        loss_layers.true_positive_rate_at_false_positive_rate_loss(
            tf.expand_dims(targets[:, i], -1),
            tf.expand_dims(logits[:, i], -1),
            target_precision[i],
            label_priors=label_priors[i])[0]
        for i in range(num_labels)
    ]

    single_label_losses = tf.concat(single_label_losses, 1)

    with self.test_session() as session:
      tf.global_variables_initializer().run()
      multi_label_loss_val, single_label_loss_val = session.run(
          [multi_label_loss, single_label_losses])
      self.assertAllClose(multi_label_loss_val, single_label_loss_val)


class UtilityFunctionsTest(tf.test.TestCase):

  def testTrainableDualVariable(self):
    # Confirm correct behavior of a trainable dual variable.
    x = tf.get_variable('primal', dtype=tf.float32, initializer=2.0)
    y_value, y = loss_layers._create_dual_variable(
        'dual', shape=None, dtype=tf.float32, initializer=1.0,
        collections=None, trainable=True, dual_rate_factor=0.3)
    optimizer = tf.train.GradientDescentOptimizer(learning_rate=1.0)
    update = optimizer.minimize(0.5 * tf.square(x - y_value))

    with self.test_session():
      tf.global_variables_initializer().run()
      update.run()
      self.assertAllClose(0.7, y.eval())

  def testUntrainableDualVariable(self):
    # Confirm correct behavior of dual variable which is not trainable.
    x = tf.get_variable('primal', dtype=tf.float32, initializer=-2.0)
    y_value, y = loss_layers._create_dual_variable(
        'dual', shape=None, dtype=tf.float32, initializer=1.0,
        collections=None, trainable=False, dual_rate_factor=0.8)
    optimizer = tf.train.GradientDescentOptimizer(learning_rate=1.0)
    update = optimizer.minimize(tf.square(x) * y_value + tf.exp(y_value))

    with self.test_session():
      tf.global_variables_initializer().run()
      update.run()
      self.assertAllClose(1.0, y.eval())


class BoundTest(parameterized.TestCase, tf.test.TestCase):

  @parameterized.named_parameters(
      ('_xent', 'xent', 1.0, [2.0, 1.0]),
      ('_xent_weighted', 'xent',
       numpy.array([0, 2, 0.5, 1, 2, 3]).reshape(6, 1), [2.5, 0]),
      ('_hinge', 'hinge', 1.0, [2.0, 1.0]),
      ('_hinge_weighted', 'hinge',
       numpy.array([1.0, 2, 3, 4, 5, 6]).reshape(6, 1), [5.0, 1]))
  def testLowerBoundMultilabel(self, surrogate_type, weights, expected):
    labels, logits, _ = _multilabel_data()
    lower_bound = loss_layers.true_positives_lower_bound(
        labels, logits, weights, surrogate_type)

    with self.test_session():
      self.assertAllClose(lower_bound.eval(), expected)

  @parameterized.named_parameters(('_xent', 'xent'), ('_hinge', 'hinge'))
  def testLowerBoundOtherMultilabel(self, surrogate_type):
    labels, logits, _ = _other_multilabel_data(surrogate_type)()
    lower_bound = loss_layers.true_positives_lower_bound(
        labels, logits, 1.0, surrogate_type)

    with self.test_session():
      self.assertAllClose(lower_bound.eval(), [4.0, 2.0], atol=1e-5)

  @parameterized.named_parameters(
      ('_xent', 'xent', 1.0, [1.0, 2.0]),
      ('_xent_weighted', 'xent',
       numpy.array([3.0, 2, 1, 0, 1, 2]).reshape(6, 1), [2.0, 1.0]),
      ('_hinge', 'hinge', 1.0, [1.0, 2.0]),
      ('_hinge_weighted', 'hinge',
       numpy.array([13, 12, 11, 0.5, 0, 0.5]).reshape(6, 1), [0.5, 0.5]))
  def testUpperBoundMultilabel(self, surrogate_type, weights, expected):
    labels, logits, _ = _multilabel_data()
    upper_bound = loss_layers.false_positives_upper_bound(
        labels, logits, weights, surrogate_type)

    with self.test_session():
      self.assertAllClose(upper_bound.eval(), expected)

  @parameterized.named_parameters(('_xent', 'xent'), ('_hinge', 'hinge'))
  def testUpperBoundOtherMultilabel(self, surrogate_type):
    labels, logits, _ = _other_multilabel_data(surrogate_type)()
    upper_bound = loss_layers.false_positives_upper_bound(
        labels, logits, 1.0, surrogate_type)

    with self.test_session():
      self.assertAllClose(upper_bound.eval(), [2.0, 4.0], atol=1e-5)

  @parameterized.named_parameters(('_lower', 'lower'), ('_upper', 'upper'))
  def testThreeDimensionalLogits(self, bound):
    bound_function = loss_layers.false_positives_upper_bound
    if bound == 'lower':
      bound_function = loss_layers.true_positives_lower_bound
    random_labels = numpy.float32(numpy.random.uniform(size=[2, 3]) > 0.5)
    random_logits = numpy.float32(numpy.random.randn(2, 3, 2))
    first_slice_logits = random_logits[:, :, 0].reshape(2, 3)
    second_slice_logits = random_logits[:, :, 1].reshape(2, 3)

    full_bound = bound_function(
        tf.constant(random_labels), tf.constant(random_logits), 1.0, 'xent')
    first_slice_bound = bound_function(tf.constant(random_labels),
                                       tf.constant(first_slice_logits),
                                       1.0,
                                       'xent')
    second_slice_bound = bound_function(tf.constant(random_labels),
                                        tf.constant(second_slice_logits),
                                        1.0,
                                        'xent')
    stacked_bound = tf.stack([first_slice_bound, second_slice_bound], axis=1)

    with self.test_session():
      self.assertAllClose(full_bound.eval(), stacked_bound.eval())


def run_lagrange_multiplier_test(global_objective,
                                 objective_kwargs,
                                 data_builder,
                                 test_object):
  """Runs a test for the Lagrange multiplier update of `global_objective`.

  The test checks that the constraint for `global_objective` is satisfied on
  the first label of the data produced by `data_builder` but not the second.

  Args:
    global_objective: One of the global objectives.
    objective_kwargs: A dictionary of keyword arguments to pass to
      `global_objective`. Must contain an entry for the constraint argument
      of `global_objective`, e.g. 'target_rate' or 'target_precision'.
    data_builder: A function which returns tensors corresponding to labels,
      logits, and label priors.
    test_object: An instance of tf.test.TestCase.
  """
  # Construct global objective kwargs from a copy of `objective_kwargs`.
  kwargs = dict(objective_kwargs)
  targets, logits, priors = data_builder()
  kwargs['labels'] = targets
  kwargs['logits'] = logits
  kwargs['label_priors'] = priors

  loss, output_dict = global_objective(**kwargs)
  lambdas = tf.squeeze(output_dict['lambdas'])
  opt = tf.train.GradientDescentOptimizer(learning_rate=0.1)
  update_op = opt.minimize(loss, var_list=[output_dict['lambdas']])

  with test_object.test_session() as session:
    tf.global_variables_initializer().run()
    lambdas_before = session.run(lambdas)
    session.run(update_op)
    lambdas_after = session.run(lambdas)

    test_object.assertLess(lambdas_after[0], lambdas_before[0])
    test_object.assertGreater(lambdas_after[1], lambdas_before[1])


class CrossFunctionTest(parameterized.TestCase, tf.test.TestCase):

  @parameterized.named_parameters(
      ('_auc01xent', loss_layers.precision_recall_auc_loss,
       {'precision_range': (0.0, 1.0), 'surrogate_type': 'xent'}),
      ('_auc051xent', loss_layers.precision_recall_auc_loss,
       {'precision_range': (0.5, 1.0), 'surrogate_type': 'xent'}),
      ('_auc01)hinge', loss_layers.precision_recall_auc_loss,
       {'precision_range': (0.0, 1.0), 'surrogate_type': 'hinge'}),
      ('_ratp04', loss_layers.recall_at_precision_loss,
       {'target_precision': 0.4, 'surrogate_type': 'xent'}),
      ('_ratp066', loss_layers.recall_at_precision_loss,
       {'target_precision': 0.66, 'surrogate_type': 'xent'}),
      ('_ratp07_hinge', loss_layers.recall_at_precision_loss,
       {'target_precision': 0.7, 'surrogate_type': 'hinge'}),
      ('_fpattp066',
       loss_layers.false_positive_rate_at_true_positive_rate_loss,
       {'target_rate': 0.66, 'surrogate_type': 'xent'}),
      ('_fpattp046',
       loss_layers.false_positive_rate_at_true_positive_rate_loss,
       {'target_rate': 0.46, 'surrogate_type': 'xent'}),
      ('_fpattp076_hinge',
       loss_layers.false_positive_rate_at_true_positive_rate_loss,
       {'target_rate': 0.76, 'surrogate_type': 'hinge'}),
      ('_fpattp036_hinge',
       loss_layers.false_positive_rate_at_true_positive_rate_loss,
       {'target_rate': 0.36, 'surrogate_type': 'hinge'}),
  )
  def testWeigtedGlobalObjective(self, global_objective, objective_kwargs):
    """Runs a test of `global_objective` with per-example weights.

    Args:
      global_objective: One of the global objectives.
      objective_kwargs: A dictionary of keyword arguments to pass to
        `global_objective`. Must contain keys 'surrogate_type', and the
        keyword for the constraint argument of `global_objective`, e.g.
        'target_rate' or 'target_precision'.
    """
    logits_positives = tf.constant([1, -0.5, 3], shape=[3, 1])
    logits_negatives = tf.constant([-0.5, 1, -1, -1, -0.5, 1], shape=[6, 1])

    # Dummy tensor is used to compute the gradients.
    dummy = tf.constant(1.0)
    logits = tf.concat([logits_positives, logits_negatives], 0)
    logits = tf.multiply(logits, dummy)
    targets = tf.constant([1, 1, 1, 0, 0, 0, 0, 0, 0],
                          shape=[9, 1], dtype=tf.float32)
    priors = tf.constant(1.0 / 3.0, shape=[1])
    weights = tf.constant([1, 1, 1, 0, 0, 0, 2, 2, 2],
                          shape=[9, 1], dtype=tf.float32)

    # Construct global objective kwargs.
    objective_kwargs['labels'] = targets
    objective_kwargs['logits'] = logits
    objective_kwargs['label_priors'] = priors
    scope = 'weighted_test'

    # Unweighted loss.
    objective_kwargs['scope'] = scope + '_plain'
    raw_loss, update = global_objective(**objective_kwargs)
    loss = tf.reduce_sum(raw_loss)

    # Weighted loss.
    objective_kwargs['weights'] = weights
    objective_kwargs['scope'] = scope + '_weighted'
    raw_weighted_loss, weighted_update = global_objective(**objective_kwargs)
    weighted_loss = tf.reduce_sum(raw_weighted_loss)

    lambdas = tf.contrib.framework.get_unique_variable(
        scope + '_plain/lambdas')
    weighted_lambdas = tf.contrib.framework.get_unique_variable(
        scope + '_weighted/lambdas')

    logits_gradient = tf.gradients(loss, dummy)
    weighted_logits_gradient = tf.gradients(weighted_loss, dummy)

    with self.test_session() as session:
      tf.global_variables_initializer().run()
      self.assertAllClose(loss.eval(), weighted_loss.eval())

      logits_grad, weighted_logits_grad = session.run(
          [logits_gradient, weighted_logits_gradient])
      self.assertAllClose(logits_grad, weighted_logits_grad)

      session.run([update, weighted_update])
      lambdas_value, weighted_lambdas_value = session.run(
          [lambdas, weighted_lambdas])
      self.assertAllClose(lambdas_value, weighted_lambdas_value)

  @parameterized.named_parameters(
      ('_prauc051xent', loss_layers.precision_recall_auc_loss,
       {'precision_range': (0.5, 1.0), 'surrogate_type': 'xent'}),
      ('_prauc01hinge', loss_layers.precision_recall_auc_loss,
       {'precision_range': (0.0, 1.0), 'surrogate_type': 'hinge'}),
      ('_rocxent', loss_layers.roc_auc_loss, {'surrogate_type': 'xent'}),
      ('_rochinge', loss_layers.roc_auc_loss, {'surrogate_type': 'xent'}),
      ('_ratp04', loss_layers.recall_at_precision_loss,
       {'target_precision': 0.4, 'surrogate_type': 'xent'}),
      ('_ratp07_hinge', loss_layers.recall_at_precision_loss,
       {'target_precision': 0.7, 'surrogate_type': 'hinge'}),
      ('_patr05', loss_layers.precision_at_recall_loss,
       {'target_recall': 0.4, 'surrogate_type': 'xent'}),
      ('_patr08_hinge', loss_layers.precision_at_recall_loss,
       {'target_recall': 0.7, 'surrogate_type': 'hinge'}),
      ('_fpattp046',
       loss_layers.false_positive_rate_at_true_positive_rate_loss,
       {'target_rate': 0.46, 'surrogate_type': 'xent'}),
      ('_fpattp036_hinge',
       loss_layers.false_positive_rate_at_true_positive_rate_loss,
       {'target_rate': 0.36, 'surrogate_type': 'hinge'}),
      ('_tpatfp076',
       loss_layers.true_positive_rate_at_false_positive_rate_loss,
       {'target_rate': 0.76, 'surrogate_type': 'xent'}),
      ('_tpatfp036_hinge',
       loss_layers.true_positive_rate_at_false_positive_rate_loss,
       {'target_rate': 0.36, 'surrogate_type': 'hinge'}),
  )
  def testVectorAndMatrixLabelEquivalence(self, global_objective,
                                          objective_kwargs):
    """Tests equivalence between label shape [batch_size] or [batch_size, 1]."""
    vector_labels = tf.constant([1.0, 1.0, 0.0, 0.0], shape=[4])
    vector_logits = tf.constant([1.0, 0.1, 0.1, -1.0], shape=[4])

    # Construct vector global objective kwargs and loss.
    vector_kwargs = objective_kwargs.copy()
    vector_kwargs['labels'] = vector_labels
    vector_kwargs['logits'] = vector_logits
    vector_loss, _ = global_objective(**vector_kwargs)
    vector_loss_sum = tf.reduce_sum(vector_loss)

    # Construct matrix global objective kwargs and loss.
    matrix_kwargs = objective_kwargs.copy()
    matrix_kwargs['labels'] = tf.expand_dims(vector_labels, 1)
    matrix_kwargs['logits'] = tf.expand_dims(vector_logits, 1)
    matrix_loss, _ = global_objective(**matrix_kwargs)
    matrix_loss_sum = tf.reduce_sum(matrix_loss)

    self.assertEqual(1, vector_loss.get_shape().ndims)
    self.assertEqual(2, matrix_loss.get_shape().ndims)

    with self.test_session():
      tf.global_variables_initializer().run()
      self.assertAllClose(vector_loss_sum.eval(), matrix_loss_sum.eval())

  @parameterized.named_parameters(
      ('_prauc', loss_layers.precision_recall_auc_loss, None),
      ('_roc', loss_layers.roc_auc_loss, None),
      ('_rap', loss_layers.recall_at_precision_loss,
       {'target_precision': 0.8}),
      ('_patr', loss_layers.precision_at_recall_loss, {'target_recall': 0.7}),
      ('_fpattp',
       loss_layers.false_positive_rate_at_true_positive_rate_loss,
       {'target_rate': 0.9}),
      ('_tpatfp',
       loss_layers.true_positive_rate_at_false_positive_rate_loss,
       {'target_rate': 0.1})
  )
  def testUnknownBatchSize(self, global_objective, objective_kwargs):
    # Tests that there are no errors when the batch size is not known.
    batch_shape = [5, 2]
    logits = tf.placeholder(tf.float32)
    logits_feed = numpy.random.randn(*batch_shape)
    labels = tf.placeholder(tf.float32)
    labels_feed = logits_feed > 0.1
    logits.set_shape([None, 2])
    labels.set_shape([None, 2])

    if objective_kwargs is None:
      objective_kwargs = {}

    placeholder_kwargs = objective_kwargs.copy()
    placeholder_kwargs['labels'] = labels
    placeholder_kwargs['logits'] = logits
    placeholder_loss, _ = global_objective(**placeholder_kwargs)

    kwargs = objective_kwargs.copy()
    kwargs['labels'] = labels_feed
    kwargs['logits'] = logits_feed
    loss, _ = global_objective(**kwargs)

    with self.test_session() as session:
      tf.global_variables_initializer().run()
      feed_loss_val = session.run(placeholder_loss,
                                  feed_dict={logits: logits_feed,
                                             labels: labels_feed})
      loss_val = session.run(loss)
      self.assertAllClose(feed_loss_val, loss_val)


# Both sets of logits below are designed so that the surrogate precision and
# recall (true positive rate) of class 1 is ~ 2/3, and the same surrogates for
# class 2 are ~ 1/3. The false positive rate surrogates are ~ 1/3 and 2/3.
def _multilabel_data():
  targets = tf.constant([1.0, 1.0, 1.0, 0.0, 0.0, 0.0], shape=[6, 1])
  targets = tf.concat([targets, targets], 1)
  logits_positives = tf.constant([[0.0, 15],
                                  [16, 0.0],
                                  [14, 0.0]], shape=[3, 2])
  logits_negatives = tf.constant([[-17, 0.0],
                                  [-15, 0.0],
                                  [0.0, -101]], shape=[3, 2])
  logits = tf.concat([logits_positives, logits_negatives], 0)
  priors = tf.constant(0.5, shape=[2])

  return targets, logits, priors


def _other_multilabel_data(surrogate_type):
  targets = tf.constant(
      [1.0] * 6 + [0.0] * 6, shape=[12, 1])
  targets = tf.concat([targets, targets], 1)
  logits_positives = tf.constant([[0.0, 13],
                                  [12, 0.0],
                                  [15, 0.0],
                                  [0.0, 30],
                                  [13, 0.0],
                                  [18, 0.0]], shape=[6, 2])
  # A score of cost_2 incurs a loss of ~2.0.
  cost_2 = 1.0 if surrogate_type == 'hinge' else 1.09861229
  logits_negatives = tf.constant([[-16, cost_2],
                                  [-15, cost_2],
                                  [cost_2, -111],
                                  [-133, -14,],
                                  [-14.0100101, -16,],
                                  [-19.888828882, -101]], shape=[6, 2])
  logits = tf.concat([logits_positives, logits_negatives], 0)
  priors = tf.constant(0.5, shape=[2])

  def builder():
    return targets, logits, priors

  return builder


if __name__ == '__main__':
  tf.test.main()
research/global_objectives/test_all.py
0 → 100644
View file @
6c6f3f3a
# Copyright 2018 The TensorFlow Global Objectives Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Runs all unit tests in the Global Objectives package.
Requires that TensorFlow and abseil (https://github.com/abseil/abseil-py) be
installed on your machine. Command to run the tests:
python test_all.py
"""
import os
import sys
import unittest

this_file = os.path.realpath(__file__)
start_dir = os.path.dirname(this_file)
parent_dir = os.path.dirname(start_dir)
sys.path.append(parent_dir)

loader = unittest.TestLoader()
suite = loader.discover(start_dir, pattern='*_test.py')

runner = unittest.TextTestRunner(verbosity=2)
runner.run(suite)
research/global_objectives/util.py
0 → 100644
View file @
6c6f3f3a
# Copyright 2018 The TensorFlow Global Objectives Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Contains utility functions for the global objectives library."""
# Dependency imports
import tensorflow as tf


def weighted_sigmoid_cross_entropy_with_logits(labels,
                                               logits,
                                               positive_weights=1.0,
                                               negative_weights=1.0,
                                               name=None):
"""Computes a weighting of sigmoid cross entropy given `logits`.
Measures the weighted probability error in discrete classification tasks in
which classes are independent and not mutually exclusive. For instance, one
could perform multilabel classification where a picture can contain both an
elephant and a dog at the same time. The class weight multiplies the
different types of errors.
For brevity, let `x = logits`, `z = labels`, `c = positive_weights`,
  `d = negative_weights`. The weighted logistic loss is
```
c * z * -log(sigmoid(x)) + d * (1 - z) * -log(1 - sigmoid(x))
= c * z * -log(1 / (1 + exp(-x))) - d * (1 - z) * log(exp(-x) / (1 + exp(-x)))
= c * z * log(1 + exp(-x)) + d * (1 - z) * (-log(exp(-x)) + log(1 + exp(-x)))
= c * z * log(1 + exp(-x)) + d * (1 - z) * (x + log(1 + exp(-x)))
= (1 - z) * x * d + (1 - z + c * z ) * log(1 + exp(-x))
= - d * x * z + d * x + (d - d * z + c * z ) * log(1 + exp(-x))
```
To ensure stability and avoid overflow, the implementation uses the identity
log(1 + exp(-x)) = max(0,-x) + log(1 + exp(-abs(x)))
and the result is computed as
```
= -d * x * z + d * x
+ (d - d * z + c * z ) * (max(0,-x) + log(1 + exp(-abs(x))))
```
Note that the loss is NOT an upper bound on the 0-1 loss, unless it is divided
by log(2).
Args:
labels: A `Tensor` of type `float32` or `float64`. `labels` can be a 2D
tensor with shape [batch_size, num_labels] or a 3D tensor with shape
[batch_size, num_labels, K].
logits: A `Tensor` of the same type and shape as `labels`. If `logits` has
shape [batch_size, num_labels, K], the loss is computed separately on each
slice [:, :, k] of `logits`.
positive_weights: A `Tensor` that holds positive weights and has the
following semantics according to its shape:
scalar - A global positive weight.
1D tensor - must be of size K, a weight for each 'attempt'
2D tensor - of size [num_labels, K'] where K' is either K or 1.
The `positive_weights` will be expanded to the left to match the
dimensions of logits and labels.
negative_weights: A `Tensor` that holds positive weight and has the
semantics identical to positive_weights.
name: A name for the operation (optional).
Returns:
A `Tensor` of the same shape as `logits` with the componentwise
weighted logistic losses.
"""
  with tf.name_scope(
      name,
      'weighted_logistic_loss',
      [logits, labels, positive_weights, negative_weights]) as name:
    labels, logits, positive_weights, negative_weights = prepare_loss_args(
        labels, logits, positive_weights, negative_weights)

    softplus_term = tf.add(tf.maximum(-logits, 0.0),
                           tf.log(1.0 + tf.exp(-tf.abs(logits))))
    weight_dependent_factor = (
        negative_weights + (positive_weights - negative_weights) * labels)
    return (negative_weights * (logits - labels * logits) +
            weight_dependent_factor * softplus_term)
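
# Illustrative numeric check of the docstring derivation above (an editor's
# sketch, not part of the original module): with scalar weights c and d, the
# weighted loss should equal c*z*softplus(-x) + d*(1-z)*softplus(x), and the
# stabilized form computed by the function should match it. Assumes NumPy is
# available; run it separately, e.g. pasted into a Python shell.
#
#   import numpy as np
#   softplus = lambda t: max(t, 0) + np.log1p(np.exp(-abs(t)))
#   x, c, d = 3.7, 2.0, 0.5
#   for z in (0.0, 1.0):
#     naive = c * z * softplus(-x) + d * (1 - z) * softplus(x)
#     stable = (d * (x - z * x) +
#               (d + (c - d) * z) * (max(-x, 0) + np.log1p(np.exp(-abs(x)))))
#     assert abs(naive - stable) < 1e-12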


def weighted_hinge_loss(labels,
                        logits,
                        positive_weights=1.0,
                        negative_weights=1.0,
                        name=None):
"""Computes weighted hinge loss given logits `logits`.
The loss applies to multi-label classification tasks where labels are
independent and not mutually exclusive. See also
`weighted_sigmoid_cross_entropy_with_logits`.
Args:
labels: A `Tensor` of type `float32` or `float64`. Each entry must be
either 0 or 1. `labels` can be a 2D tensor with shape
[batch_size, num_labels] or a 3D tensor with shape
[batch_size, num_labels, K].
logits: A `Tensor` of the same type and shape as `labels`. If `logits` has
shape [batch_size, num_labels, K], the loss is computed separately on each
slice [:, :, k] of `logits`.
positive_weights: A `Tensor` that holds positive weights and has the
following semantics according to its shape:
scalar - A global positive weight.
1D tensor - must be of size K, a weight for each 'attempt'
2D tensor - of size [num_labels, K'] where K' is either K or 1.
The `positive_weights` will be expanded to the left to match the
dimensions of logits and labels.
negative_weights: A `Tensor` that holds positive weight and has the
semantics identical to positive_weights.
name: A name for the operation (optional).
Returns:
A `Tensor` of the same shape as `logits` with the componentwise
weighted hinge loss.
"""
with
tf
.
name_scope
(
name
,
'weighted_hinge_loss'
,
[
logits
,
labels
,
positive_weights
,
negative_weights
])
as
name
:
labels
,
logits
,
positive_weights
,
negative_weights
=
prepare_loss_args
(
labels
,
logits
,
positive_weights
,
negative_weights
)
positives_term
=
positive_weights
*
labels
*
tf
.
maximum
(
1.0
-
logits
,
0
)
negatives_term
=
(
negative_weights
*
(
1.0
-
labels
)
*
tf
.
maximum
(
1.0
+
logits
,
0
))
return
positives_term
+
negatives_term
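
# Illustrative sketch (hypothetical shapes): with `logits` of shape
# [batch_size, num_labels, K], a length-K weight vector assigns a separate
# positive weight to each of the K 'attempts'.
#
#   # logits: [batch_size, num_labels, 3], labels: [batch_size, num_labels]
#   loss = weighted_hinge_loss(
#       labels, logits, positive_weights=[1.0, 2.0, 4.0], negative_weights=1.0)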


def weighted_surrogate_loss(labels,
                            logits,
                            surrogate_type='xent',
                            positive_weights=1.0,
                            negative_weights=1.0,
                            name=None):
  """Returns either weighted cross-entropy or hinge loss.

  For example, if `surrogate_type` is 'xent', this returns the weighted
  cross-entropy loss.

  Args:
    labels: A `Tensor` of type `float32` or `float64`. Each entry must be
      between 0 and 1. `labels` can be a 2D tensor with shape
      [batch_size, num_labels] or a 3D tensor with shape
      [batch_size, num_labels, K].
    logits: A `Tensor` of the same type and shape as `labels`. If `logits` has
      shape [batch_size, num_labels, K], each slice [:, :, k] represents an
      'attempt' to predict `labels` and the loss is computed per slice.
    surrogate_type: A string that determines which loss to return, supports
      'xent' for cross-entropy and 'hinge' for hinge loss.
    positive_weights: A `Tensor` that holds positive weights and has the
      following semantics according to its shape:
        scalar - A global positive weight.
        1D tensor - must be of size K, a weight for each 'attempt'.
        2D tensor - of size [num_labels, K'] where K' is either K or 1.
      The `positive_weights` will be expanded to the left to match the
      dimensions of `logits` and `labels`.
    negative_weights: A `Tensor` that holds negative weights with semantics
      identical to `positive_weights`.
    name: A name for the operation (optional).

  Returns:
    The weighted loss.

  Raises:
    ValueError: If value of `surrogate_type` is not supported.
  """
  with tf.name_scope(
      name, 'weighted_loss',
      [logits, labels, surrogate_type, positive_weights,
       negative_weights]) as name:
    if surrogate_type == 'xent':
      return weighted_sigmoid_cross_entropy_with_logits(
          logits=logits,
          labels=labels,
          positive_weights=positive_weights,
          negative_weights=negative_weights,
          name=name)
    elif surrogate_type == 'hinge':
      return weighted_hinge_loss(
          logits=logits,
          labels=labels,
          positive_weights=positive_weights,
          negative_weights=negative_weights,
          name=name)
    raise ValueError('surrogate_type %s not supported.' % surrogate_type)
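
# Illustrative sketch (not part of the library): the same call site can switch
# between the two surrogates above with a single string argument.
#
#   xent = weighted_surrogate_loss(labels, logits, surrogate_type='xent')
#   hinge = weighted_surrogate_loss(labels, logits, surrogate_type='hinge')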


def expand_outer(tensor, rank):
  """Expands the given `Tensor` outwards to a target rank.

  For example, if rank = 3 and tensor.shape is [3, 4], this function will
  expand the tensor such that the resulting shape will be [1, 3, 4].

  Args:
    tensor: The tensor to expand.
    rank: The target dimension.

  Returns:
    The expanded tensor.

  Raises:
    ValueError: If rank of `tensor` is unknown, or if `rank` is smaller than
      the rank of `tensor`.
  """
  if tensor.get_shape().ndims is None:
    raise ValueError('tensor dimension must be known.')
  if len(tensor.get_shape()) > rank:
    raise ValueError(
        '`rank` must be at least the current tensor dimension: (%s vs %s).' %
        (rank, len(tensor.get_shape())))
  while len(tensor.get_shape()) < rank:
    tensor = tf.expand_dims(tensor, 0)
  return tensor
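
# Illustrative sketch: a [3, 4] tensor expanded to rank 3 gains a leading unit
# dimension, matching how the loss functions broadcast their weight arguments.
#
#   t = tf.ones([3, 4])
#   expanded = expand_outer(t, 3)  # expanded has shape [1, 3, 4]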


def build_label_priors(labels,
                       weights=None,
                       positive_pseudocount=1.0,
                       negative_pseudocount=1.0,
                       variables_collections=None):
  """Creates an op to maintain and update label prior probabilities.

  For each label, the label priors are estimated as
      (P + sum_i w_i y_i) / (P + N + sum_i w_i),
  where y_i is the ith label, w_i is the ith weight, P is a pseudo-count of
  positive labels, and N is a pseudo-count of negative labels. The index i
  ranges over all labels observed during all evaluations of the returned op.

  Args:
    labels: A `Tensor` with shape [batch_size, num_labels]. Entries should be
      in [0, 1].
    weights: Coefficients representing the weight of each label. Must be
      either a Tensor of shape [batch_size, num_labels] or `None`, in which
      case each weight is treated as 1.0.
    positive_pseudocount: Number of positive labels used to initialize the
      label priors.
    negative_pseudocount: Number of negative labels used to initialize the
      label priors.
    variables_collections: Optional list of collections for created variables.

  Returns:
    label_priors: An op to update the weighted label_priors. Gives the
      current value of the label priors when evaluated.
  """
  dtype = labels.dtype.base_dtype
  num_labels = get_num_labels(labels)

  if weights is None:
    weights = tf.ones_like(labels)

  # We disable partitioning while constructing dual variables because they
  # will be updated with assign, which is not available for partitioned
  # variables.
  partitioner = tf.get_variable_scope().partitioner
  try:
    tf.get_variable_scope().set_partitioner(None)

    # Create variable and update op for weighted label counts.
    weighted_label_counts = tf.contrib.framework.model_variable(
        name='weighted_label_counts',
        shape=[num_labels],
        dtype=dtype,
        initializer=tf.constant_initializer(
            [positive_pseudocount] * num_labels, dtype=dtype),
        collections=variables_collections,
        trainable=False)
    weighted_label_counts_update = weighted_label_counts.assign_add(
        tf.reduce_sum(weights * labels, 0))

    # Create variable and update op for the sum of the weights.
    weight_sum = tf.contrib.framework.model_variable(
        name='weight_sum',
        shape=[num_labels],
        dtype=dtype,
        initializer=tf.constant_initializer(
            [positive_pseudocount + negative_pseudocount] * num_labels,
            dtype=dtype),
        collections=variables_collections,
        trainable=False)
    weight_sum_update = weight_sum.assign_add(tf.reduce_sum(weights, 0))
  finally:
    tf.get_variable_scope().set_partitioner(partitioner)

  label_priors = tf.div(weighted_label_counts_update, weight_sum_update)
  return label_priors
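
# Illustrative sketch (assumes a TF1-style session loop; `batch_labels` and
# `num_batches` are hypothetical): each evaluation of the returned op folds
# the current batch into the running estimate of the label priors.
#
#   label_priors = build_label_priors(batch_labels)
#   with tf.Session() as sess:
#     sess.run(tf.global_variables_initializer())
#     for _ in range(num_batches):
#       current_priors = sess.run(label_priors)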


def convert_and_cast(value, name, dtype):
  """Convert input to tensor and cast to dtype.

  Args:
    value: An object whose type has a registered Tensor conversion function,
      e.g. python numerical type or numpy array.
    name: Name to use for the new Tensor, if one is created.
    dtype: Optional element type for the returned tensor.

  Returns:
    A tensor.
  """
  return tf.cast(tf.convert_to_tensor(value, name=name), dtype=dtype)


def prepare_loss_args(labels, logits, positive_weights, negative_weights):
  """Prepares arguments for weighted loss functions.

  If needed, converts the given arguments to the appropriate type and shape.

  Args:
    labels: Labels of the loss function.
    logits: Logits of the loss function.
    positive_weights: Weight on the positive examples.
    negative_weights: Weight on the negative examples.

  Returns:
    Converted labels, logits, positive_weights, negative_weights.
  """
  logits = tf.convert_to_tensor(logits, name='logits')
  labels = convert_and_cast(labels, 'labels', logits.dtype)
  if len(labels.get_shape()) == 2 and len(logits.get_shape()) == 3:
    labels = tf.expand_dims(labels, [2])

  positive_weights = convert_and_cast(positive_weights, 'positive_weights',
                                      logits.dtype)
  positive_weights = expand_outer(positive_weights, logits.get_shape().ndims)
  negative_weights = convert_and_cast(negative_weights, 'negative_weights',
                                      logits.dtype)
  negative_weights = expand_outer(negative_weights, logits.get_shape().ndims)
  return labels, logits, positive_weights, negative_weights


def get_num_labels(labels_or_logits):
  """Returns the number of labels inferred from labels_or_logits."""
  if labels_or_logits.get_shape().ndims <= 1:
    return 1
  return labels_or_logits.get_shape()[1].value
research/global_objectives/util_test.py
0 → 100644
View file @
6c6f3f3a
# Copyright 2018 The TensorFlow Global Objectives Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Tests for global objectives util functions."""
# Dependency imports
from absl.testing import parameterized
import numpy as np
import tensorflow as tf

from global_objectives import util


def weighted_sigmoid_cross_entropy(targets, logits, weight):
  return (weight * targets * np.log(1.0 + np.exp(-logits)) +
          ((1.0 - targets) * np.log(1.0 + 1.0 / np.exp(-logits))))


def hinge_loss(labels, logits):
  # Mostly copied from tensorflow.python.ops.losses but with loss per
  # datapoint.
  labels = tf.to_float(labels)
  all_ones = tf.ones_like(labels)
  labels = tf.subtract(2 * labels, all_ones)
  return tf.nn.relu(tf.subtract(all_ones, tf.multiply(labels, logits)))


class WeightedSigmoidCrossEntropyTest(parameterized.TestCase,
                                      tf.test.TestCase):

  def testTrivialCompatibilityWithSigmoidCrossEntropy(self):
    """Tests compatibility with unweighted function with weight 1.0."""
    x_shape = [300, 10]
    targets = np.random.random_sample(x_shape).astype(np.float32)
    logits = np.random.randn(*x_shape).astype(np.float32)
    weighted_loss = util.weighted_sigmoid_cross_entropy_with_logits(
        targets, logits)
    expected_loss = (
        tf.contrib.nn.deprecated_flipped_sigmoid_cross_entropy_with_logits(
            logits, targets))
    with self.test_session():
      self.assertAllClose(expected_loss.eval(),
                          weighted_loss.eval(),
                          atol=0.000001)

  def testNonTrivialCompatibilityWithSigmoidCrossEntropy(self):
    """Tests use of an arbitrary weight (4.12)."""
    x_shape = [300, 10]
    targets = np.random.random_sample(x_shape).astype(np.float32)
    logits = np.random.randn(*x_shape).astype(np.float32)
    weight = 4.12
    weighted_loss = util.weighted_sigmoid_cross_entropy_with_logits(
        targets, logits, weight, weight)
    expected_loss = (
        weight *
        tf.contrib.nn.deprecated_flipped_sigmoid_cross_entropy_with_logits(
            logits, targets))
    with self.test_session():
      self.assertAllClose(expected_loss.eval(),
                          weighted_loss.eval(),
                          atol=0.000001)

  def testDifferentSizeWeightedSigmoidCrossEntropy(self):
    """Tests correctness on 3D tensors.

    Tests that the function works as expected when logits is a 3D tensor and
    targets is a 2D tensor.
    """
    targets_shape = [30, 4]
    logits_shape = [targets_shape[0], targets_shape[1], 3]
    targets = np.random.random_sample(targets_shape).astype(np.float32)
    logits = np.random.randn(*logits_shape).astype(np.float32)

    weight_vector = [2.0, 3.0, 13.0]
    loss = util.weighted_sigmoid_cross_entropy_with_logits(targets, logits,
                                                           weight_vector)

    with self.test_session():
      loss = loss.eval()
      for i in range(0, len(weight_vector)):
        expected = weighted_sigmoid_cross_entropy(targets, logits[:, :, i],
                                                  weight_vector[i])
        self.assertAllClose(loss[:, :, i], expected, atol=0.000001)

  @parameterized.parameters((300, 10, 0.3), (20, 4, 2.0), (30, 4, 3.9))
  def testWeightedSigmoidCrossEntropy(self, batch_size, num_labels, weight):
    """Tests that the tf and numpy functions agree on many instances."""
    x_shape = [batch_size, num_labels]
    targets = np.random.random_sample(x_shape).astype(np.float32)
    logits = np.random.randn(*x_shape).astype(np.float32)

    with self.test_session():
      loss = util.weighted_sigmoid_cross_entropy_with_logits(
          targets, logits, weight, 1.0, name='weighted-loss')
      expected = weighted_sigmoid_cross_entropy(targets, logits, weight)
      self.assertAllClose(expected, loss.eval(), atol=0.000001)

  def testGradients(self):
    """Tests that weighted loss gradients behave as expected."""
    dummy_tensor = tf.constant(1.0)

    positives_shape = [10, 1]
    positives_logits = dummy_tensor * tf.Variable(
        tf.random_normal(positives_shape) + 1.0)
    positives_targets = tf.ones(positives_shape)
    positives_weight = 4.6
    positives_loss = (
        tf.contrib.nn.deprecated_flipped_sigmoid_cross_entropy_with_logits(
            positives_logits, positives_targets) * positives_weight)

    negatives_shape = [190, 1]
    negatives_logits = dummy_tensor * tf.Variable(
        tf.random_normal(negatives_shape))
    negatives_targets = tf.zeros(negatives_shape)
    negatives_weight = 0.9
    negatives_loss = (
        tf.contrib.nn.deprecated_flipped_sigmoid_cross_entropy_with_logits(
            negatives_logits, negatives_targets) * negatives_weight)

    all_logits = tf.concat([positives_logits, negatives_logits], 0)
    all_targets = tf.concat([positives_targets, negatives_targets], 0)
    weighted_loss = tf.reduce_sum(
        util.weighted_sigmoid_cross_entropy_with_logits(
            all_targets, all_logits, positives_weight, negatives_weight))
    weighted_gradients = tf.gradients(weighted_loss, dummy_tensor)

    expected_loss = tf.add(
        tf.reduce_sum(positives_loss),
        tf.reduce_sum(negatives_loss))
    expected_gradients = tf.gradients(expected_loss, dummy_tensor)

    with tf.Session() as session:
      tf.global_variables_initializer().run()
      grad, expected_grad = session.run(
          [weighted_gradients, expected_gradients])
      self.assertAllClose(grad, expected_grad)

  def testDtypeFlexibility(self):
    """Tests the loss on inputs of varying data types."""
    shape = [20, 3]
    logits = np.random.randn(*shape)
    targets = tf.truncated_normal(shape)
    positive_weights = tf.constant(3, dtype=tf.int64)
    negative_weights = 1

    loss = util.weighted_sigmoid_cross_entropy_with_logits(
        targets, logits, positive_weights, negative_weights)

    with self.test_session():
      self.assertEqual(loss.eval().dtype, np.float)


class WeightedHingeLossTest(tf.test.TestCase):

  def testTrivialCompatibilityWithHinge(self):
    # Tests compatibility with unweighted hinge loss.
    x_shape = [55, 10]
    logits = tf.constant(np.random.randn(*x_shape).astype(np.float32))
    targets = tf.to_float(tf.constant(np.random.random_sample(x_shape) > 0.3))
    weighted_loss = util.weighted_hinge_loss(targets, logits)
    expected_loss = hinge_loss(targets, logits)
    with self.test_session():
      self.assertAllClose(expected_loss.eval(), weighted_loss.eval())

  def testLessTrivialCompatibilityWithHinge(self):
    # Tests compatibility with a constant weight for positives and negatives.
    x_shape = [56, 11]
    logits = tf.constant(np.random.randn(*x_shape).astype(np.float32))
    targets = tf.to_float(tf.constant(np.random.random_sample(x_shape) > 0.7))
    weight = 1.0 + 1.0 / 2 + 1.0 / 3 + 1.0 / 4 + 1.0 / 5 + 1.0 / 6 + 1.0 / 7
    weighted_loss = util.weighted_hinge_loss(targets, logits, weight, weight)
    expected_loss = hinge_loss(targets, logits) * weight
    with self.test_session():
      self.assertAllClose(expected_loss.eval(), weighted_loss.eval())

  def testNontrivialCompatibilityWithHinge(self):
    # Tests compatibility with different positive and negative weights.
    x_shape = [23, 8]
    logits_positives = tf.constant(np.random.randn(
        *x_shape).astype(np.float32))
    logits_negatives = tf.constant(np.random.randn(
        *x_shape).astype(np.float32))
    targets_positives = tf.ones(x_shape)
    targets_negatives = tf.zeros(x_shape)
    logits = tf.concat([logits_positives, logits_negatives], 0)
    targets = tf.concat([targets_positives, targets_negatives], 0)

    raw_loss = util.weighted_hinge_loss(targets,
                                        logits,
                                        positive_weights=3.4,
                                        negative_weights=1.2)
    loss = tf.reduce_sum(raw_loss, 0)
    positives_hinge = hinge_loss(targets_positives, logits_positives)
    negatives_hinge = hinge_loss(targets_negatives, logits_negatives)
    expected = tf.add(tf.reduce_sum(3.4 * positives_hinge, 0),
                      tf.reduce_sum(1.2 * negatives_hinge, 0))

    with self.test_session():
      self.assertAllClose(loss.eval(), expected.eval())

  def test3DLogitsAndTargets(self):
    # Tests correctness when logits is 3D and targets is 2D.
    targets_shape = [30, 4]
    logits_shape = [targets_shape[0], targets_shape[1], 3]
    targets = tf.to_float(
        tf.constant(np.random.random_sample(targets_shape) > 0.7))
    logits = tf.constant(np.random.randn(*logits_shape).astype(np.float32))
    weight_vector = [1.0, 1.0, 1.0]
    loss = util.weighted_hinge_loss(targets, logits, weight_vector)

    with self.test_session():
      loss_value = loss.eval()
      for i in range(len(weight_vector)):
        expected = hinge_loss(targets, logits[:, :, i]).eval()
        self.assertAllClose(loss_value[:, :, i], expected)


class BuildLabelPriorsTest(tf.test.TestCase):

  def testLabelPriorConsistency(self):
    # Checks that, with zero pseudocounts, the returned label priors reproduce
    # label frequencies in the batch.
    batch_shape = [4, 10]
    labels = tf.Variable(
        tf.to_float(tf.greater(tf.random_uniform(batch_shape), 0.678)))

    label_priors_update = util.build_label_priors(
        labels=labels, positive_pseudocount=0, negative_pseudocount=0)
    expected_priors = tf.reduce_mean(labels, 0)

    with self.test_session():
      tf.global_variables_initializer().run()
      self.assertAllClose(label_priors_update.eval(), expected_priors.eval())

  def testLabelPriorsUpdate(self):
    # Checks that the update of label priors behaves as expected.
    batch_shape = [1, 5]
    labels = tf.Variable(
        tf.to_float(tf.greater(tf.random_uniform(batch_shape), 0.4)))
    label_priors_update = util.build_label_priors(labels)

    label_sum = np.ones(shape=batch_shape)
    weight_sum = 2.0 * np.ones(shape=batch_shape)

    with self.test_session() as session:
      tf.global_variables_initializer().run()

      for _ in range(3):
        label_sum += labels.eval()
        weight_sum += np.ones(shape=batch_shape)
        expected_posteriors = label_sum / weight_sum
        label_priors = label_priors_update.eval().reshape(batch_shape)
        self.assertAllClose(label_priors, expected_posteriors)

        # Re-initialize labels to get a new random sample.
        session.run(labels.initializer)

  def testLabelPriorsUpdateWithWeights(self):
    # Checks the update of label priors with per-example weights.
    batch_size = 6
    num_labels = 5
    batch_shape = [batch_size, num_labels]
    labels = tf.Variable(
        tf.to_float(tf.greater(tf.random_uniform(batch_shape), 0.6)))
    weights = tf.Variable(tf.random_uniform(batch_shape) * 6.2)

    update_op = util.build_label_priors(labels, weights=weights)

    expected_weighted_label_counts = 1.0 + tf.reduce_sum(weights * labels, 0)
    expected_weight_sum = 2.0 + tf.reduce_sum(weights, 0)
    expected_label_posteriors = tf.divide(expected_weighted_label_counts,
                                          expected_weight_sum)

    with self.test_session() as session:
      tf.global_variables_initializer().run()
      updated_priors, expected_posteriors = session.run(
          [update_op, expected_label_posteriors])
      self.assertAllClose(updated_priors, expected_posteriors)


class WeightedSurrogateLossTest(parameterized.TestCase, tf.test.TestCase):

  @parameterized.parameters(
      ('hinge', util.weighted_hinge_loss),
      ('xent', util.weighted_sigmoid_cross_entropy_with_logits))
  def testCompatibilityLoss(self, loss_name, loss_fn):
    x_shape = [28, 4]
    logits = tf.constant(np.random.randn(*x_shape).astype(np.float32))
    targets = tf.to_float(tf.constant(np.random.random_sample(x_shape) > 0.5))
    positive_weights = 0.66
    negative_weights = 11.1
    expected_loss = loss_fn(
        targets,
        logits,
        positive_weights=positive_weights,
        negative_weights=negative_weights)
    computed_loss = util.weighted_surrogate_loss(
        targets,
        logits,
        loss_name,
        positive_weights=positive_weights,
        negative_weights=negative_weights)
    with self.test_session():
      self.assertAllClose(expected_loss.eval(), computed_loss.eval())

  def testSurrogateError(self):
    x_shape = [7, 3]
    logits = tf.constant(np.random.randn(*x_shape).astype(np.float32))
    targets = tf.to_float(tf.constant(np.random.random_sample(x_shape) > 0.5))

    with self.assertRaises(ValueError):
      util.weighted_surrogate_loss(logits, targets, 'bug')


if __name__ == '__main__':
  tf.test.main()