"vscode:/vscode.git/clone" did not exist on "535eb011a97910edf083b9b29f5749d86d624b4b"
Commit 9062d200 authored by André Araujo, committed by aquariusjay

Code for Detect-to-Retrieve fully integrated (#6829)

* Initial feature aggregation code for Detect-to-Retrieve paper.

PiperOrigin-RevId: 246043144

* Add support for ASMK/ASMK*/R-ASMK/R-ASMK*.

PiperOrigin-RevId: 247337028

* Add DatumProto uint32 field, and limit datum_io to uint32 and float32/float64 types.

Also, introduce DatumPairProto, to be used for ASMK variants. Functions to read/write in this new format are added and tested.

PiperOrigin-RevId: 247515205

* Add batching option to feature aggregation extraction.

PiperOrigin-RevId: 247614627

* Script to perform local feature aggregation, with associated configs.

Also small edits to the aggregation extractor, for better handling of input features / avoiding OOM.

PiperOrigin-RevId: 248150750

* Tests to check that aggregation using regions with no local features works.

PiperOrigin-RevId: 248153275

* Include new library/proto for aggregation

* Merged commit includes the following changes:

PiperOrigin-RevId: 248176511

* Merged commit includes the following changes:
248194572  by Andre Araujo:

    Change tf.tensor_scatter_nd_add --> tf.compat.v1.tensor_scatter_add to make it compatible with TF 1.X.

--

PiperOrigin-RevId: 248194572
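(For reference, both op names implement scatter-add: `updates` are accumulated into an existing tensor at the given `indices`. A rough NumPy sketch of the same semantics, illustrative only and not code from this commit:

import numpy as np

# Scatter-add: accumulate `updates` into `target` at `indices`; repeated
# indices accumulate, which is the behavior the TF op relies on.
target = np.zeros(5, dtype=np.float32)
indices = np.array([1, 3, 1])
updates = np.array([10.0, 20.0, 5.0], dtype=np.float32)
np.add.at(target, indices, updates)
print(target)  # [ 0. 15.  0. 20.  0.]
)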

* Functions to parse ground-truth and compute metrics for revisited datasets.

Unit tests are added.

PiperOrigin-RevId: 248561575

* Small change to argparse bool option, which does not work as expected.

PiperOrigin-RevId: 248805505

* Class to compute similarity between aggregated descriptors.

PiperOrigin-RevId: 249102986

* Script to perform retrieval and compute metrics.

PiperOrigin-RevId: 249104011

* feature_aggregation_similarity library in DELF init

* D2R instructions / README update

* Small edit to README

* Internal change.

PiperOrigin-RevId: 249113531

* Instructions to reproduce D2R paper results, and small edits to config files.

PiperOrigin-RevId: 249159850
parent 7ac267a8
@@ -94,6 +94,11 @@ Presented in the
Presented in the
[CVPR'19 Detect-to-Retrieve paper](https://arxiv.org/abs/1812.01584).
Besides these, we also release pre-trained codebooks for local feature
aggregation. See the
[Detect-to-Retrieve instructions](delf/python/detect_to_retrieve/DETECT_TO_RETRIEVE_INSTRUCTIONS.md)
for details.
### DELF extraction and matching
Please follow [these instructions](EXTRACTION_MATCHING.md). At the end, you
@@ -110,7 +115,10 @@ a nice figure showing a detection, as:
### Detect-to-Retrieve
-Code release is in progress. Stay tuned!
+Please follow
+[these instructions](delf/python/detect_to_retrieve/DETECT_TO_RETRIEVE_INSTRUCTIONS.md).
+At the end, you should obtain image retrieval results on the Revisited
+Oxford/Paris datasets.
## Code overview
@@ -121,6 +129,8 @@ therein, `protos` and `python`.
This directory contains protobufs:
- `aggregation_config.proto`: protobuf for configuring local feature
aggregation.
- `box.proto`: protobuf for serializing detected boxes.
- `datum.proto`: general-purpose protobuf for serializing float tensors.
- `delf_config.proto`: protobuf for configuring DELF extraction.
@@ -133,14 +143,15 @@ This directory contains files for several different purposes:
- `box_io.py`, `datum_io.py`, `feature_io.py` are helper files for reading and
writing tensors and features.
- `delf_v1.py` contains the code to create DELF models.
- `feature_aggregation_extractor.py` contains a module to perform local
feature aggregation.
- `feature_aggregation_similarity.py` contains a module to perform similarity
computation for aggregated local features.
- `feature_extractor.py` contains the code to extract features using DELF.
This is particularly useful for extracting features over multiple scales,
with keypoint selection based on attention scores, and PCA/whitening
post-processing.
Besides these, other files in this directory contain tests for different
modules.
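A minimal usage sketch of the new similarity module (based on the unit tests added in this commit; not part of the original README):

import numpy as np
from delf import aggregation_config_pb2
from delf import feature_aggregation_similarity

# VLAD similarity between two aggregated descriptors is a plain dot product.
config = aggregation_config_pb2.AggregationConfig()
config.aggregation_type = aggregation_config_pb2.AggregationConfig.VLAD
similarity_computer = (
    feature_aggregation_similarity.SimilarityAggregatedRepresentation(config))
similarity = similarity_computer.ComputeSimilarity(
    np.array([0, 1, 2, 3, 4]), np.array([5, 6, 7, 8, 9]))  # 80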
The subdirectory `delf/python/examples` contains sample scripts to run DELF
feature extraction/matching, and object detection:
@@ -151,8 +162,26 @@ feature extraction/matching, and object detection:
- `match_images.py` supports image matching using DELF features extracted
using `extract_features.py`.
-The subdirectory `delf/python/detect_to_retrieve` contains sample scripts
-related to the Detect-to-Retrieve paper (work in progress).
+The subdirectory `delf/python/detect_to_retrieve` contains sample
+scripts/configs related to the Detect-to-Retrieve paper:
- `cluster_delf_features.py` for local feature clustering.
- `dataset.py` for parsing/evaluating results on Revisited Oxford/Paris
datasets.
- `extract_aggregation.py` for aggregated local feature extraction.
- `extract_index_boxes_and_features.py` for index image local feature
extraction / bounding box detection on Revisited datasets.
- `extract_query_features.py` for query image local feature extraction on
Revisited datasets.
- `perform_retrieval.py` for performing retrieval/evaluating methods using
aggregated local features on Revisited datasets.
- `delf_gld_config.pbtxt` gives the DelfConfig used in Detect-to-Retrieve
paper.
- `index_aggregation_config.pbtxt`, `query_aggregation_config.pbtxt` give
AggregationConfig's for Detect-to-Retrieve experiments.
Besides these, other files in the different subdirectories contain tests for the
various modules.
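A minimal sketch of using the `dataset.py` module listed above (assuming the Revisited Oxford ground-truth file `gnd_roxford5k.mat` is available locally; not part of the original README):

from delf.python.detect_to_retrieve import dataset

# Parse query/index image names and per-query ground-truth dicts.
query_list, index_list, ground_truth = dataset.ReadDatasetFile(
    '/tmp/gnd_roxford5k.mat')
# Convert to the easy/medium/hard evaluation protocols.
easy_gt, medium_gt, hard_gt = dataset.ParseEasyMediumHardGroundTruth(
    ground_truth)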
## Maintainers
@@ -162,7 +191,7 @@ André Araujo (@andrefaraujo)
### April, 2019
-Detect-to-Retrieve code released (work in progress).
+Detect-to-Retrieve code released.
Includes pre-trained models to detect landmark boxes, and DELF model pre-trained
on Google Landmarks v1 dataset.
@@ -28,6 +28,7 @@ from delf.python import datum_io
from delf.python import delf_v1
from delf.python import detect_to_retrieve
from delf.python import feature_aggregation_extractor
from delf.python import feature_aggregation_similarity
from delf.python import feature_extractor
from delf.python import feature_io
from delf.python.examples import extract_boxes
-# Copyright 2017 The TensorFlow Authors All Rights Reserved.
+# Copyright 2019 The TensorFlow Authors All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
@@ -12,7 +12,7 @@
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Python interface for Revisited Oxford/Paris dataset."""
"""Python library to parse ground-truth/evaluate on Revisited datasets."""
from __future__ import absolute_import
from __future__ import division
@@ -22,7 +22,7 @@ import numpy as np
from scipy.io import matlab
import tensorflow as tf
-_GROUND_TRUTH_KEYS = ['easy', 'hard', 'junk', 'ok']
+_GROUND_TRUTH_KEYS = ['easy', 'hard', 'junk']
def ReadDatasetFile(dataset_file_path):
@@ -36,9 +36,9 @@ def ReadDatasetFile(dataset_file_path):
index_list: List of index image names.
ground_truth: List containing ground-truth information for dataset. Each
entry is a dict corresponding to the ground-truth information for a query.
-The dict may have keys 'easy', 'hard', 'junk' or 'ok', mapping to a list
-of integers; additionally, it has a key 'bbx' mapping to a list of floats
-with bounding box coordinates.
+The dict may have keys 'easy', 'hard', or 'junk', mapping to a NumPy
+array of integers; additionally, it has a key 'bbx' mapping to a NumPy
+array of floats with bounding box coordinates.
"""
with tf.gfile.GFile(dataset_file_path, 'r') as f:
cfg = matlab.loadmat(f)
@@ -59,3 +59,258 @@ def ReadDatasetFile(dataset_file_path):
ground_truth.append(query_ground_truth)
return query_list, index_list, ground_truth
def _ParseGroundTruth(ok_list, junk_list):
"""Constructs dictionary of ok/junk indices for a data subset and query.
Args:
ok_list: List of NumPy arrays containing true positive indices for query.
junk_list: List of NumPy arrays containing ignored indices for query.
Returns:
ok_junk_dict: Dict mapping 'ok' and 'junk' strings to NumPy array of
indices.
"""
ok_junk_dict = {}
ok_junk_dict['ok'] = np.concatenate(ok_list)
ok_junk_dict['junk'] = np.concatenate(junk_list)
return ok_junk_dict
def ParseEasyMediumHardGroundTruth(ground_truth):
"""Parses easy/medium/hard ground-truth from Revisited datasets.
Args:
ground_truth: Usually the output from ReadDatasetFile(). List containing
ground-truth information for dataset. Each entry is a dict corresponding
to the ground-truth information for a query. The dict must have keys
'easy', 'hard', and 'junk', mapping to a NumPy array of integers.
Returns:
easy_ground_truth: List containing ground-truth information for easy subset
of dataset. Each entry is a dict corresponding to the ground-truth
information for a query. The dict has keys 'ok' and 'junk', mapping to a
NumPy array of integers.
medium_ground_truth: Same as `easy_ground_truth`, but for the medium subset.
hard_ground_truth: Same as `easy_ground_truth`, but for the hard subset.
"""
num_queries = len(ground_truth)
easy_ground_truth = []
medium_ground_truth = []
hard_ground_truth = []
for i in range(num_queries):
easy_ground_truth.append(
_ParseGroundTruth([ground_truth[i]['easy']],
[ground_truth[i]['junk'], ground_truth[i]['hard']]))
medium_ground_truth.append(
_ParseGroundTruth([ground_truth[i]['easy'], ground_truth[i]['hard']],
[ground_truth[i]['junk']]))
hard_ground_truth.append(
_ParseGroundTruth([ground_truth[i]['hard']],
[ground_truth[i]['junk'], ground_truth[i]['easy']]))
return easy_ground_truth, medium_ground_truth, hard_ground_truth
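# Illustrative example (mirrors the unit test for this function): a query with
# easy=[10, 56, 100], hard=[0], junk=[6, 90] yields
#   easy protocol:   ok=[10, 56, 100]    junk=[6, 90, 0]
#   medium protocol: ok=[10, 56, 100, 0] junk=[6, 90]
#   hard protocol:   ok=[0]              junk=[6, 90, 10, 56, 100]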
def AdjustPositiveRanks(positive_ranks, junk_ranks):
"""Adjusts positive ranks based on junk ranks.
Args:
positive_ranks: Sorted 1D NumPy integer array.
junk_ranks: Sorted 1D NumPy integer array.
Returns:
adjusted_positive_ranks: Sorted 1D NumPy array.
"""
if not junk_ranks.size:
return positive_ranks
adjusted_positive_ranks = positive_ranks
j = 0
for i, positive_index in enumerate(positive_ranks):
while (j < len(junk_ranks) and positive_index > junk_ranks[j]):
j += 1
adjusted_positive_ranks[i] -= j
return adjusted_positive_ranks
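# Illustrative example (mirrors the unit test for this function): with
# positive_ranks=[0, 2, 6, 10, 20] and junk_ranks=[1, 8, 9, 30], each positive
# rank is decreased by the number of junk entries ranked before it, giving
# adjusted_positive_ranks=[0, 1, 5, 7, 17].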
def ComputeAveragePrecision(positive_ranks):
"""Computes average precision according to dataset convention.
It assumes that `positive_ranks` contains the ranks for all expected positive
index images to be retrieved. If `positive_ranks` is empty, returns
`average_precision` = 0.
Note that average precision computation here does NOT use the finite sum
method (see
https://en.wikipedia.org/wiki/Evaluation_measures_(information_retrieval)#Average_precision)
which is common in information retrieval literature. Instead, the method
implemented here integrates over the precision-recall curve by averaging two
adjacent precision points, then multiplying by the recall step. This is the
convention for the Revisited Oxford/Paris datasets.
Args:
positive_ranks: Sorted 1D NumPy integer array, zero-indexed.
Returns:
average_precision: Float.
"""
average_precision = 0.0
num_expected_positives = len(positive_ranks)
if not num_expected_positives:
return average_precision
recall_step = 1.0 / num_expected_positives
for i, rank in enumerate(positive_ranks):
if not rank:
left_precision = 1.0
else:
left_precision = i / rank
right_precision = (i + 1) / (rank + 1)
average_precision += (left_precision + right_precision) * recall_step / 2
return average_precision
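# Worked example of the convention above (mirrors the unit test): for
# positive_ranks=[0, 2, 5], recall_step = 1/3 and
#   AP = (1.0 + 1/1)/2 * 1/3 + (1/2 + 2/3)/2 * 1/3 + (2/5 + 3/6)/2 * 1/3
#      ~= 0.677778.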
def ComputePRAtRanks(positive_ranks, desired_pr_ranks):
"""Computes precision/recall at desired ranks.
It assumes that `positive_ranks` contains the ranks for all expected positive
index images to be retrieved. If `positive_ranks` is empty, return all-zeros
`precisions`/`recalls`.
If a desired rank is larger than the last positive rank, its precision is
computed based on the last positive rank. For example, if `desired_pr_ranks`
is [10] and `positive_ranks` = [0, 7] --> `precisions` = [0.25], `recalls` =
[1.0].
Args:
positive_ranks: 1D NumPy integer array, zero-indexed.
desired_pr_ranks: List of integers containing the desired precision/recall
ranks to be reported. Eg, if precision@1/recall@1 and
precision@10/recall@10 are desired, this should be set to [1, 10].
Returns:
precisions: Precision @ `desired_pr_ranks` (NumPy array of
floats, with shape [len(desired_pr_ranks)]).
recalls: Recall @ `desired_pr_ranks` (NumPy array of floats, with
shape [len(desired_pr_ranks)]).
"""
num_desired_pr_ranks = len(desired_pr_ranks)
precisions = np.zeros([num_desired_pr_ranks])
recalls = np.zeros([num_desired_pr_ranks])
num_expected_positives = len(positive_ranks)
if not num_expected_positives:
return precisions, recalls
positive_ranks_one_indexed = positive_ranks + 1
for i, desired_pr_rank in enumerate(desired_pr_ranks):
recalls[i] = np.sum(
positive_ranks_one_indexed <= desired_pr_rank) / num_expected_positives
# If `desired_pr_rank` is larger than last positive's rank, only compute
# precision with respect to last positive's position.
precision_rank = min(max(positive_ranks_one_indexed), desired_pr_rank)
precisions[i] = np.sum(
positive_ranks_one_indexed <= precision_rank) / precision_rank
return precisions, recalls
def ComputeMetrics(sorted_index_ids, ground_truth, desired_pr_ranks):
"""Computes metrics for retrieval results on the Revisited datasets.
If there are no valid ground-truth index images for a given query, the metric
results for the given query (`average_precisions`, `precisions` and `recalls`)
are set to NaN, and they are not taken into account when computing the
aggregated metrics (`mean_average_precision`, `mean_precisions` and
`mean_recalls`) over all queries.
Args:
sorted_index_ids: Integer NumPy array of shape [#queries, #index_images].
For each query, contains an array denoting the most relevant index images,
sorted from most to least relevant.
ground_truth: List containing ground-truth information for dataset. Each
entry is a dict corresponding to the ground-truth information for a query.
The dict has keys 'ok' and 'junk', mapping to a NumPy array of integers.
desired_pr_ranks: List of integers containing the desired precision/recall
ranks to be reported. Eg, if precision@1/recall@1 and
precision@10/recall@10 are desired, this should be set to [1, 10]. The
largest item should be <= #index_images.
Returns:
mean_average_precision: Mean average precision (float).
mean_precisions: Mean precision @ `desired_pr_ranks` (NumPy array of
floats, with shape [len(desired_pr_ranks)]).
mean_recalls: Mean recall @ `desired_pr_ranks` (NumPy array of floats, with
shape [len(desired_pr_ranks)]).
average_precisions: Average precision for each query (NumPy array of floats,
with shape [#queries]).
precisions: Precision @ `desired_pr_ranks`, for each query (NumPy array of
floats, with shape [#queries, len(desired_pr_ranks)]).
recalls: Recall @ `desired_pr_ranks`, for each query (NumPy array of
floats, with shape [#queries, len(desired_pr_ranks)]).
Raises:
ValueError: If largest desired PR rank in `desired_pr_ranks` >
#index_images.
"""
num_queries, num_index_images = sorted_index_ids.shape
num_desired_pr_ranks = len(desired_pr_ranks)
sorted_desired_pr_ranks = sorted(desired_pr_ranks)
if sorted_desired_pr_ranks[-1] > num_index_images:
raise ValueError(
'Requested PR ranks up to %d, however there are only %d images' %
(sorted_desired_pr_ranks[-1], num_index_images))
# Instantiate all outputs, then loop over each query and gather metrics.
mean_average_precision = 0.0
mean_precisions = np.zeros([num_desired_pr_ranks])
mean_recalls = np.zeros([num_desired_pr_ranks])
average_precisions = np.zeros([num_queries])
precisions = np.zeros([num_queries, num_desired_pr_ranks])
recalls = np.zeros([num_queries, num_desired_pr_ranks])
num_empty_gt_queries = 0
for i in range(num_queries):
ok_index_images = ground_truth[i]['ok']
junk_index_images = ground_truth[i]['junk']
if not ok_index_images.size:
average_precisions[i] = float('nan')
precisions[i, :] = float('nan')
recalls[i, :] = float('nan')
num_empty_gt_queries += 1
continue
positive_ranks = np.arange(num_index_images)[np.in1d(
sorted_index_ids[i], ok_index_images)]
junk_ranks = np.arange(num_index_images)[np.in1d(sorted_index_ids[i],
junk_index_images)]
adjusted_positive_ranks = AdjustPositiveRanks(positive_ranks, junk_ranks)
average_precisions[i] = ComputeAveragePrecision(adjusted_positive_ranks)
precisions[i, :], recalls[i, :] = ComputePRAtRanks(adjusted_positive_ranks,
desired_pr_ranks)
mean_average_precision += average_precisions[i]
mean_precisions += precisions[i, :]
mean_recalls += recalls[i, :]
# Normalize aggregated metrics by number of queries.
num_valid_queries = num_queries - num_empty_gt_queries
mean_average_precision /= num_valid_queries
mean_precisions /= num_valid_queries
mean_recalls /= num_valid_queries
return (mean_average_precision, mean_precisions, mean_recalls,
average_precisions, precisions, recalls)
# Copyright 2019 The TensorFlow Authors All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Tests for the python library parsing Revisited Oxford/Paris datasets."""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import numpy as np
import tensorflow as tf
from delf.python.detect_to_retrieve import dataset
class DatasetTest(tf.test.TestCase):
def testParseEasyMediumHardGroundTruth(self):
# Define input.
ground_truth = [{
'easy': np.array([10, 56, 100]),
'hard': np.array([0]),
'junk': np.array([6, 90])
}, {
'easy': np.array([], dtype='int64'),
'hard': [5],
'junk': [99, 100]
}, {
'easy': [33],
'hard': [66, 99],
'junk': np.array([], dtype='int64')
}]
# Run tested function.
(easy_ground_truth, medium_ground_truth,
hard_ground_truth) = dataset.ParseEasyMediumHardGroundTruth(ground_truth)
# Define expected outputs.
expected_easy_ground_truth = [{
'ok': np.array([10, 56, 100]),
'junk': np.array([6, 90, 0])
}, {
'ok': np.array([], dtype='int64'),
'junk': np.array([99, 100, 5])
}, {
'ok': np.array([33]),
'junk': np.array([66, 99])
}]
expected_medium_ground_truth = [{
'ok': np.array([10, 56, 100, 0]),
'junk': np.array([6, 90])
}, {
'ok': np.array([5]),
'junk': np.array([99, 100])
}, {
'ok': np.array([33, 66, 99]),
'junk': np.array([], dtype='int64')
}]
expected_hard_ground_truth = [{
'ok': np.array([0]),
'junk': np.array([6, 90, 10, 56, 100])
}, {
'ok': np.array([5]),
'junk': np.array([99, 100])
}, {
'ok': np.array([66, 99]),
'junk': np.array([33])
}]
# Compare actual versus expected.
def _AssertListOfDictsOfArraysAreEqual(ground_truth, expected_ground_truth):
"""Helper function to compare ground-truth data.
Args:
ground_truth: List of dicts of arrays.
expected_ground_truth: List of dicts of arrays.
"""
self.assertEqual(len(ground_truth), len(expected_ground_truth))
for i, ground_truth_entry in enumerate(ground_truth):
self.assertEqual(sorted(ground_truth_entry.keys()), ['junk', 'ok'])
self.assertAllEqual(ground_truth_entry['junk'],
expected_ground_truth[i]['junk'])
self.assertAllEqual(ground_truth_entry['ok'],
expected_ground_truth[i]['ok'])
_AssertListOfDictsOfArraysAreEqual(easy_ground_truth,
expected_easy_ground_truth)
_AssertListOfDictsOfArraysAreEqual(medium_ground_truth,
expected_medium_ground_truth)
_AssertListOfDictsOfArraysAreEqual(hard_ground_truth,
expected_hard_ground_truth)
def testAdjustPositiveRanksWorks(self):
# Define inputs.
positive_ranks = np.array([0, 2, 6, 10, 20])
junk_ranks = np.array([1, 8, 9, 30])
# Run tested function.
adjusted_positive_ranks = dataset.AdjustPositiveRanks(
positive_ranks, junk_ranks)
# Define expected output.
expected_adjusted_positive_ranks = [0, 1, 5, 7, 17]
# Compare actual versus expected.
self.assertAllEqual(adjusted_positive_ranks,
expected_adjusted_positive_ranks)
def testComputeAveragePrecisionWorks(self):
# Define input.
positive_ranks = [0, 2, 5]
# Run tested function.
average_precision = dataset.ComputeAveragePrecision(positive_ranks)
# Define expected output.
expected_average_precision = 0.677778
# Compare actual versus expected.
self.assertAllClose(average_precision, expected_average_precision)
def testComputePRAtRanksWorks(self):
# Define inputs.
positive_ranks = np.array([0, 2, 5])
desired_pr_ranks = np.array([1, 5, 10])
# Run tested function.
precisions, recalls = dataset.ComputePRAtRanks(positive_ranks,
desired_pr_ranks)
# Define expected outputs.
expected_precisions = [1.0, 0.4, 0.5]
expected_recalls = [0.333333, 0.666667, 1.0]
# Compare actual versus expected.
self.assertAllClose(precisions, expected_precisions)
self.assertAllClose(recalls, expected_recalls)
def testComputeMetricsWorks(self):
# Define inputs: 3 queries. For the last one, there are no expected images
# to be retrieved.
sorted_index_ids = np.array([[4, 2, 0, 1, 3], [0, 2, 4, 1, 3],
[0, 1, 2, 3, 4]])
ground_truth = [{
'ok': np.array([0, 1]),
'junk': np.array([2])
}, {
'ok': np.array([0, 4]),
'junk': np.array([], dtype='int64')
}, {
'ok': np.array([], dtype='int64'),
'junk': np.array([], dtype='int64')
}]
desired_pr_ranks = [1, 2, 5]
# Run tested function.
(mean_average_precision, mean_precisions, mean_recalls, average_precisions,
precisions, recalls) = dataset.ComputeMetrics(sorted_index_ids,
ground_truth,
desired_pr_ranks)
# Define expected outputs.
expected_mean_average_precision = 0.604167
expected_mean_precisions = [0.5, 0.5, 0.666667]
expected_mean_recalls = [0.25, 0.5, 1.0]
expected_average_precisions = [0.416667, 0.791667, float('nan')]
expected_precisions = [[0.0, 0.5, 0.666667], [1.0, 0.5, 0.666667],
[float('nan'),
float('nan'),
float('nan')]]
expected_recalls = [[0.0, 0.5, 1.0], [0.5, 0.5, 1.0],
[float('nan'), float('nan'),
float('nan')]]
# Compare actual versus expected.
self.assertAllClose(mean_average_precision, expected_mean_average_precision)
self.assertAllClose(mean_precisions, expected_mean_precisions)
self.assertAllClose(mean_recalls, expected_mean_recalls)
self.assertAllClose(average_precisions, expected_average_precisions)
self.assertAllClose(precisions, expected_precisions)
self.assertAllClose(recalls, expected_recalls)
if __name__ == '__main__':
tf.test.main()
@@ -207,7 +207,7 @@ if __name__ == '__main__':
""")
parser.add_argument(
'--use_query_images',
-type=bool,
+type=lambda x: (str(x).lower() == 'true'),
default=False,
help="""
If True, processes the query images of the dataset. If False, processes
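The change above replaces `type=bool`, which treats any non-empty string (including 'False') as True, with explicit string parsing. A minimal sketch of the behavior, illustrative only, reusing the flag name from this script:

import argparse

parser = argparse.ArgumentParser()
# bool('False') is True, so type=bool silently misparses '--flag False'.
parser.add_argument(
    '--use_query_images',
    type=lambda x: (str(x).lower() == 'true'),
    default=False)
print(parser.parse_args(['--use_query_images', 'False']).use_query_images)
# Prints: False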
@@ -2,7 +2,7 @@ codebook_size: 65536
feature_dimensionality: 128
aggregation_type: ASMK_STAR
use_l2_normalization: false
codebook_path: "parameters/k65536_codebook_tfckpt/codebook"
codebook_path: "parameters/rparis6k_codebook_65536/k65536_codebook_tfckpt/codebook"
num_assignments: 1
use_regional_aggregation: true
feature_batch_size: 100
# Copyright 2019 The TensorFlow Authors All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Performs image retrieval on Revisited Oxford/Paris datasets."""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import argparse
import os
import sys
import time
import numpy as np
from scipy import spatial
from skimage import measure
from skimage import transform
import tensorflow as tf
from google.protobuf import text_format
from tensorflow.python.platform import app
from delf import aggregation_config_pb2
from delf import datum_io
from delf import feature_aggregation_similarity
from delf import feature_io
from delf.python.detect_to_retrieve import dataset
cmd_args = None
# Aliases for aggregation types.
_VLAD = aggregation_config_pb2.AggregationConfig.VLAD
_ASMK = aggregation_config_pb2.AggregationConfig.ASMK
_ASMK_STAR = aggregation_config_pb2.AggregationConfig.ASMK_STAR
# Extensions.
_DELF_EXTENSION = '.delf'
_VLAD_EXTENSION_SUFFIX = 'vlad'
_ASMK_EXTENSION_SUFFIX = 'asmk'
_ASMK_STAR_EXTENSION_SUFFIX = 'asmk_star'
# Precision-recall ranks to use in metric computation.
_PR_RANKS = (1, 5, 10)
# How often (in iterations) to log status.
_STATUS_CHECK_LOAD_ITERATIONS = 50
_STATUS_CHECK_GV_ITERATIONS = 10
# Output file names.
_METRICS_FILENAME = 'metrics.txt'
# Re-ranking / geometric verification parameters.
_NUM_TO_RERANK = 100
_FEATURE_DISTANCE_THRESHOLD = 0.9
_NUM_RANSAC_TRIALS = 1000
_MIN_RANSAC_SAMPLES = 3
_RANSAC_RESIDUAL_THRESHOLD = 10
def _ReadAggregatedDescriptors(input_dir, image_list, config):
"""Reads aggregated descriptors.
Args:
input_dir: Directory where aggregated descriptors are located.
image_list: List of image names for which to load descriptors.
config: AggregationConfig used for images.
Returns:
aggregated_descriptors: List containing #images items, each a 1D NumPy
array.
visual_words: If using VLAD aggregation, returns an empty list. Otherwise,
returns a list containing #images items, each a 1D NumPy array.
"""
# Compose extension of aggregated descriptors.
extension = '.'
if config.use_regional_aggregation:
extension += 'r'
if config.aggregation_type == _VLAD:
extension += _VLAD_EXTENSION_SUFFIX
elif config.aggregation_type == _ASMK:
extension += _ASMK_EXTENSION_SUFFIX
elif config.aggregation_type == _ASMK_STAR:
extension += _ASMK_STAR_EXTENSION_SUFFIX
else:
raise ValueError('Invalid aggregation type: %d' % config.aggregation_type)
num_images = len(image_list)
aggregated_descriptors = []
visual_words = []
print('Starting to collect descriptors for %d images...' % num_images)
start = time.clock()
for i in range(num_images):
if i > 0 and i % _STATUS_CHECK_LOAD_ITERATIONS == 0:
elapsed = (time.clock() - start)
print('Reading descriptors for image %d out of %d, last %d '
'images took %f seconds' %
(i, num_images, _STATUS_CHECK_LOAD_ITERATIONS, elapsed))
start = time.clock()
descriptors_filename = image_list[i] + extension
descriptors_fullpath = os.path.join(input_dir, descriptors_filename)
if config.aggregation_type == _VLAD:
aggregated_descriptors.append(datum_io.ReadFromFile(descriptors_fullpath))
else:
d, v = datum_io.ReadPairFromFile(descriptors_fullpath)
if config.aggregation_type == _ASMK_STAR:
d = d.astype('uint8')
aggregated_descriptors.append(d)
visual_words.append(v)
return aggregated_descriptors, visual_words
def _MatchFeatures(query_locations, query_descriptors, index_image_locations,
index_image_descriptors):
"""Matches local features using geometric verification.
First, finds putative local feature matches by matching `query_descriptors`
against a KD-tree from the `index_image_descriptors`. Then, attempts to fit an
affine transformation between the putative feature correspondences using their
locations.
Args:
query_locations: Locations of local features for query image. NumPy array of
shape [#query_features, 2].
query_descriptors: Descriptors of local features for query image. NumPy
array of shape [#query_features, depth].
index_image_locations: Locations of local features for index image. NumPy
array of shape [#index_image_features, 2].
index_image_descriptors: Descriptors of local features for index image.
NumPy array of shape [#index_image_features, depth].
Returns:
score: Number of inliers of match. If no match is found, returns 0.
"""
num_features_query = query_locations.shape[0]
num_features_index_image = index_image_locations.shape[0]
if not num_features_query or not num_features_index_image:
return 0
# Find nearest-neighbor matches using a KD tree.
index_image_tree = spatial.cKDTree(index_image_descriptors)
_, indices = index_image_tree.query(
query_descriptors, distance_upper_bound=_FEATURE_DISTANCE_THRESHOLD)
# Select feature locations for putative matches.
query_locations_to_use = np.array([
query_locations[i,]
for i in range(num_features_query)
if indices[i] != num_features_index_image
])
index_image_locations_to_use = np.array([
index_image_locations[indices[i],]
for i in range(num_features_query)
if indices[i] != num_features_index_image
])
# If there are no putative matches, early return 0.
if not query_locations_to_use.shape[0]:
return 0
# Perform geometric verification using RANSAC.
_, inliers = measure.ransac(
(index_image_locations_to_use, query_locations_to_use),
transform.AffineTransform,
min_samples=_MIN_RANSAC_SAMPLES,
residual_threshold=_RANSAC_RESIDUAL_THRESHOLD,
max_trials=_NUM_RANSAC_TRIALS)
if inliers is None:
inliers = []
return sum(inliers)
def _RerankByGeometricVerification(input_ranks, initial_scores, query_name,
index_names, query_features_dir,
index_features_dir, junk_ids):
"""Re-ranks retrieval results using geometric verification.
Args:
input_ranks: 1D NumPy array with indices of top-ranked index images, sorted
from the most to the least similar.
initial_scores: 1D NumPy array with initial similarity scores between query
and index images. Entry i corresponds to score for image i.
query_name: Name for query image (string).
index_names: List of names for index images (strings).
query_features_dir: Directory where query local feature file is located
(string).
index_features_dir: Directory where index local feature files are located
(string).
junk_ids: Set with indices of junk images which should not be considered
during re-ranking.
Returns:
output_ranks: 1D NumPy array with index image indices, sorted from the most
to the least similar according to the geometric verification and initial
scores.
Raises:
ValueError: If `input_ranks`, `initial_scores` and `index_names` do not have
the same number of entries.
"""
num_index_images = len(index_names)
if len(input_ranks) != num_index_images:
raise ValueError('input_ranks and index_names have different number of '
'elements: %d vs %d' %
(len(input_ranks), len(index_names)))
if len(initial_scores) != num_index_images:
raise ValueError('initial_scores and index_names have different number of '
'elements: %d vs %d' %
(len(initial_scores), len(index_names)))
# Filter out junk images from list that will be re-ranked.
input_ranks_for_gv = []
for ind in input_ranks:
if ind not in junk_ids:
input_ranks_for_gv.append(ind)
num_to_rerank = min(_NUM_TO_RERANK, len(input_ranks_for_gv))
# Load query image features.
query_features_path = os.path.join(query_features_dir,
query_name + _DELF_EXTENSION)
query_locations, _, query_descriptors, _, _ = feature_io.ReadFromFile(
query_features_path)
# Initialize list containing number of inliers and initial similarity scores.
inliers_and_initial_scores = []
for i in range(num_index_images):
inliers_and_initial_scores.append([0, initial_scores[i]])
# Loop over top-ranked images and get results.
print('Starting to re-rank')
for i in range(num_to_rerank):
if i > 0 and i % _STATUS_CHECK_GV_ITERATIONS == 0:
print('Re-ranking: i = %d out of %d' % (i, num_to_rerank))
index_image_id = input_ranks_for_gv[i]
# Load index image features.
index_image_features_path = os.path.join(
index_features_dir, index_names[index_image_id] + _DELF_EXTENSION)
(index_image_locations, _, index_image_descriptors, _,
_) = feature_io.ReadFromFile(index_image_features_path)
inliers_and_initial_scores[index_image_id][0] = _MatchFeatures(
query_locations, query_descriptors, index_image_locations,
index_image_descriptors)
# Sort based on (inliers_score, initial_score).
def _InliersInitialScoresSorting(k):
"""Helper function to sort list based on two entries.
Args:
k: Index into `inliers_and_initial_scores`.
Returns:
Tuple containing inlier score and initial score.
"""
return (inliers_and_initial_scores[k][0], inliers_and_initial_scores[k][1])
output_ranks = sorted(
range(num_index_images), key=_InliersInitialScoresSorting, reverse=True)
return output_ranks
def _SaveMetricsFile(mean_average_precision, mean_precisions, mean_recalls,
pr_ranks, output_path):
"""Saves aggregated retrieval metrics to text file.
Args:
mean_average_precision: Dict mapping each dataset protocol to a float.
mean_precisions: Dict mapping each dataset protocol to a NumPy array of
floats with shape [len(pr_ranks)].
mean_recalls: Dict mapping each dataset protocol to a NumPy array of floats
with shape [len(pr_ranks)].
pr_ranks: List of integers.
output_path: Full file path.
"""
with tf.gfile.GFile(output_path, 'w') as f:
for k in sorted(mean_average_precision.keys()):
f.write('{}\n mAP={}\n mP@k{} {}\n mR@k{} {}\n'.format(
k, np.around(mean_average_precision[k] * 100, decimals=2),
np.array(pr_ranks), np.around(mean_precisions[k] * 100, decimals=2),
np.array(pr_ranks), np.around(mean_recalls[k] * 100, decimals=2)))
def main(argv):
if len(argv) > 1:
raise RuntimeError('Too many command-line arguments.')
# Parse dataset to obtain query/index images, and ground-truth.
print('Parsing dataset...')
query_list, index_list, ground_truth = dataset.ReadDatasetFile(
cmd_args.dataset_file_path)
num_query_images = len(query_list)
num_index_images = len(index_list)
(_, medium_ground_truth,
hard_ground_truth) = dataset.ParseEasyMediumHardGroundTruth(ground_truth)
print('done! Found %d queries and %d index images' %
(num_query_images, num_index_images))
# Parse AggregationConfig protos.
query_config = aggregation_config_pb2.AggregationConfig()
with tf.gfile.GFile(cmd_args.query_aggregation_config_path, 'r') as f:
text_format.Merge(f.read(), query_config)
index_config = aggregation_config_pb2.AggregationConfig()
with tf.gfile.GFile(cmd_args.index_aggregation_config_path, 'r') as f:
text_format.Merge(f.read(), index_config)
# Read aggregated descriptors.
query_aggregated_descriptors, query_visual_words = _ReadAggregatedDescriptors(
cmd_args.query_aggregation_dir, query_list, query_config)
index_aggregated_descriptors, index_visual_words = _ReadAggregatedDescriptors(
cmd_args.index_aggregation_dir, index_list, index_config)
# Create similarity computer.
similarity_computer = (
feature_aggregation_similarity.SimilarityAggregatedRepresentation(
index_config))
# Compute similarity between query and index images, potentially re-ranking
# with geometric verification.
ranks_before_gv = np.zeros([num_query_images, num_index_images],
dtype='int32')
if cmd_args.use_geometric_verification:
medium_ranks_after_gv = np.zeros([num_query_images, num_index_images],
dtype='int32')
hard_ranks_after_gv = np.zeros([num_query_images, num_index_images],
dtype='int32')
for i in range(num_query_images):
print('Performing retrieval with query %d (%s)...' % (i, query_list[i]))
start = time.clock()
# Compute similarity between aggregated descriptors.
similarities = np.zeros([num_index_images])
for j in range(num_index_images):
similarities[j] = similarity_computer.ComputeSimilarity(
query_aggregated_descriptors[i], index_aggregated_descriptors[j],
query_visual_words[i], index_visual_words[j])
ranks_before_gv[i] = np.argsort(-similarities)
# Re-rank using geometric verification.
if cmd_args.use_geometric_verification:
medium_ranks_after_gv[i] = _RerankByGeometricVerification(
ranks_before_gv[i], similarities, query_list[i], index_list,
cmd_args.query_features_dir, cmd_args.index_features_dir,
set(medium_ground_truth[i]['junk']))
hard_ranks_after_gv[i] = _RerankByGeometricVerification(
ranks_before_gv[i], similarities, query_list[i], index_list,
cmd_args.query_features_dir, cmd_args.index_features_dir,
set(hard_ground_truth[i]['junk']))
elapsed = (time.clock() - start)
print('done! Retrieval for query %d took %f seconds' % (i, elapsed))
# Create output directory if necessary.
if not os.path.exists(cmd_args.output_dir):
os.makedirs(cmd_args.output_dir)
# Compute metrics.
medium_metrics = dataset.ComputeMetrics(ranks_before_gv, medium_ground_truth,
_PR_RANKS)
hard_metrics = dataset.ComputeMetrics(ranks_before_gv, hard_ground_truth,
_PR_RANKS)
if cmd_args.use_geometric_verification:
medium_metrics_after_gv = dataset.ComputeMetrics(medium_ranks_after_gv,
medium_ground_truth,
_PR_RANKS)
hard_metrics_after_gv = dataset.ComputeMetrics(hard_ranks_after_gv,
hard_ground_truth, _PR_RANKS)
# Write metrics to file.
mean_average_precision_dict = {
'medium': medium_metrics[0],
'hard': hard_metrics[0]
}
mean_precisions_dict = {'medium': medium_metrics[1], 'hard': hard_metrics[1]}
mean_recalls_dict = {'medium': medium_metrics[2], 'hard': hard_metrics[2]}
if cmd_args.use_geometric_verification:
mean_average_precision_dict.update({
'medium_after_gv': medium_metrics_after_gv[0],
'hard_after_gv': hard_metrics_after_gv[0]
})
mean_precisions_dict.update({
'medium_after_gv': medium_metrics_after_gv[1],
'hard_after_gv': hard_metrics_after_gv[1]
})
mean_recalls_dict.update({
'medium_after_gv': medium_metrics_after_gv[2],
'hard_after_gv': hard_metrics_after_gv[2]
})
_SaveMetricsFile(mean_average_precision_dict, mean_precisions_dict,
mean_recalls_dict, _PR_RANKS,
os.path.join(cmd_args.output_dir, _METRICS_FILENAME))
if __name__ == '__main__':
parser = argparse.ArgumentParser()
parser.register('type', 'bool', lambda v: v.lower() == 'true')
parser.add_argument(
'--index_aggregation_config_path',
type=str,
default='/tmp/index_aggregation_config.pbtxt',
help="""
Path to index AggregationConfig proto text file. This is used to load the
aggregated descriptors from the index, and to define the parameters used
in computing similarity for aggregated descriptors.
""")
parser.add_argument(
'--query_aggregation_config_path',
type=str,
default='/tmp/query_aggregation_config.pbtxt',
help="""
Path to query AggregationConfig proto text file. This is only used to load
the aggregated descriptors for the queries.
""")
parser.add_argument(
'--dataset_file_path',
type=str,
default='/tmp/gnd_roxford5k.mat',
help="""
Dataset file for Revisited Oxford or Paris dataset, in .mat format.
""")
parser.add_argument(
'--index_aggregation_dir',
type=str,
default='/tmp/index_aggregation',
help="""
Directory where index aggregated descriptors are located.
""")
parser.add_argument(
'--query_aggregation_dir',
type=str,
default='/tmp/query_aggregation',
help="""
Directory where query aggregated descriptors are located.
""")
parser.add_argument(
'--use_geometric_verification',
type=lambda x: (str(x).lower() == 'true'),
default=False,
help="""
If True, performs re-ranking using local feature-based geometric
verification.
""")
parser.add_argument(
'--index_features_dir',
type=str,
default='/tmp/index_features',
help="""
Only used if `use_geometric_verification` is True.
Directory where index local image features are located, all in .delf
format.
""")
parser.add_argument(
'--query_features_dir',
type=str,
default='/tmp/query_features',
help="""
Only used if `use_geometric_verification` is True.
Directory where query local image features are located, all in .delf
format.
""")
parser.add_argument(
'--output_dir',
type=str,
default='/tmp/retrieval',
help="""
Directory where retrieval output will be written to. A file containing
metrics for this run is saved therein, with file name "metrics.txt".
""")
cmd_args, unparsed = parser.parse_known_args()
app.run(main=main, argv=[sys.argv[0]] + unparsed)
codebook_size: 65536
feature_dimensionality: 128
aggregation_type: ASMK_STAR
codebook_path: "parameters/k65536_codebook_tfckpt/codebook"
codebook_path: "parameters/rparis6k_codebook_65536/k65536_codebook_tfckpt/codebook"
num_assignments: 1
use_regional_aggregation: false
feature_batch_size: 100
# Copyright 2019 The TensorFlow Authors All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Local feature aggregation similarity computation.
For more details, please refer to the paper:
"Detect-to-Retrieve: Efficient Regional Aggregation for Image Search",
Proc. CVPR'19 (https://arxiv.org/abs/1812.01584).
"""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import numpy as np
from delf import aggregation_config_pb2
# Aliases for aggregation types.
_VLAD = aggregation_config_pb2.AggregationConfig.VLAD
_ASMK = aggregation_config_pb2.AggregationConfig.ASMK
_ASMK_STAR = aggregation_config_pb2.AggregationConfig.ASMK_STAR
class SimilarityAggregatedRepresentation(object):
"""Class for computing similarity of aggregated local feature representations.
Args:
aggregation_config: AggregationConfig object defining type of aggregation to
use.
Raises:
ValueError: If aggregation type is invalid.
"""
def __init__(self, aggregation_config):
self._feature_dimensionality = aggregation_config.feature_dimensionality
self._aggregation_type = aggregation_config.aggregation_type
# Only relevant if using ASMK/ASMK*. Otherwise, ignored.
self._use_l2_normalization = aggregation_config.use_l2_normalization
self._alpha = aggregation_config.alpha
self._tau = aggregation_config.tau
# Only relevant if using ASMK*. Otherwise, ignored.
self._number_bits = np.array([bin(n).count('1') for n in range(256)])
def ComputeSimilarity(self,
aggregated_descriptors_1,
aggregated_descriptors_2,
feature_visual_words_1=None,
feature_visual_words_2=None):
"""Computes similarity between aggregated descriptors.
Args:
aggregated_descriptors_1: 1-D NumPy array.
aggregated_descriptors_2: 1-D NumPy array.
feature_visual_words_1: Used only for ASMK/ASMK* aggregation type. 1-D
sorted NumPy integer array denoting visual words corresponding to
`aggregated_descriptors_1`.
feature_visual_words_2: Used only for ASMK/ASMK* aggregation type. 1-D
sorted NumPy integer array denoting visual words corresponding to
`aggregated_descriptors_2`.
Returns:
similarity: Float. The larger, the more similar.
Raises:
ValueError: If aggregation type is invalid.
"""
if self._aggregation_type == _VLAD:
similarity = np.dot(aggregated_descriptors_1, aggregated_descriptors_2)
elif self._aggregation_type == _ASMK:
similarity = self._AsmkSimilarity(
aggregated_descriptors_1,
aggregated_descriptors_2,
feature_visual_words_1,
feature_visual_words_2,
binarized=False)
elif self._aggregation_type == _ASMK_STAR:
similarity = self._AsmkSimilarity(
aggregated_descriptors_1,
aggregated_descriptors_2,
feature_visual_words_1,
feature_visual_words_2,
binarized=True)
else:
raise ValueError('Invalid aggregation type: %d' % self._aggregation_type)
return similarity
def _CheckAsmkDimensionality(self, aggregated_descriptors, num_visual_words,
descriptor_name):
"""Checks that ASMK dimensionality is as expected.
Args:
aggregated_descriptors: 1-D NumPy array.
num_visual_words: Integer.
descriptor_name: String.
Raises:
ValueError: If descriptor dimensionality is incorrect.
"""
if len(aggregated_descriptors
) / num_visual_words != self._feature_dimensionality:
raise ValueError(
'Feature dimensionality for aggregated descriptor %s is invalid: %d;'
' expected %d.' % (descriptor_name, len(aggregated_descriptors) /
num_visual_words, self._feature_dimensionality))
def _SigmaFn(self, x):
"""Selectivity ASMK/ASMK* similarity function.
Args:
x: Scalar or 1-D NumPy array.
Returns:
result: Same type as input, with output of selectivity function.
"""
if np.isscalar(x):
if x > self._tau:
result = np.sign(x) * np.power(np.absolute(x), self._alpha)
else:
result = 0.0
else:
result = np.zeros_like(x)
above_tau = np.nonzero(x > self._tau)
result[above_tau] = np.sign(x[above_tau]) * np.power(
np.absolute(x[above_tau]), self._alpha)
return result
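# Illustrative example: with alpha=3 and tau=0 (values consistent with the
# expected similarities in the unit tests), _SigmaFn(0.5) = 0.5**3 = 0.125,
# while _SigmaFn(-0.2) = 0.0 because -0.2 is not above tau.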
def _BinaryNormalizedInnerProduct(self, descriptors_1, descriptors_2):
"""Computes normalized binary inner product.
Args:
descriptors_1: 1-D NumPy integer array.
descriptors_2: 1-D NumPy integer array.
Returns:
inner_product: Float.
Raises:
ValueError: If the dimensionality of descriptors is different.
"""
num_descriptors = len(descriptors_1)
if num_descriptors != len(descriptors_2):
raise ValueError(
'Descriptors have incompatible dimensionality: %d vs %d' %
(len(descriptors_1), len(descriptors_2)))
h = 0
for i in range(num_descriptors):
h += self._number_bits[np.bitwise_xor(descriptors_1[i], descriptors_2[i])]
# If local feature dimensionality is lower than 8, then use that to compute
# proper binarized inner product.
bits_per_descriptor = min(self._feature_dimensionality, 8)
total_num_bits = bits_per_descriptor * num_descriptors
return 1.0 - 2.0 * h / total_num_bits
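# Illustrative example: with feature_dimensionality=2, single-element
# descriptors [3] (0b11) and [2] (0b10) differ in 1 of 2 bits, so the result
# is 1 - 2*1/2 = 0.0; identical descriptors give 1.0, fully opposite ones -1.0.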
def _AsmkSimilarity(self,
aggregated_descriptors_1,
aggregated_descriptors_2,
visual_words_1,
visual_words_2,
binarized=False):
"""Compute ASMK-based similarity.
If `aggregated_descriptors_1` or `aggregated_descriptors_2` is empty, we
return a similarity of -1.0.
If binarized is True, `aggregated_descriptors_1` and
`aggregated_descriptors_2` must be of type uint8.
Args:
aggregated_descriptors_1: 1-D NumPy array.
aggregated_descriptors_2: 1-D NumPy array.
visual_words_1: 1-D sorted NumPy integer array denoting visual words
corresponding to `aggregated_descriptors_1`.
visual_words_2: 1-D sorted NumPy integer array denoting visual words
corresponding to `aggregated_descriptors_2`.
binarized: If True, compute ASMK* similarity.
Returns:
similarity: Float. The larger, the more similar.
Raises:
ValueError: If input descriptor dimensionality is inconsistent, or if
descriptor type is unsupported.
"""
num_visual_words_1 = len(visual_words_1)
num_visual_words_2 = len(visual_words_2)
if not num_visual_words_1 or not num_visual_words_2:
return -1.0
# Parse dimensionality used per visual word. They must be the same for both
# aggregated descriptors. If using ASMK, they also must be equal to
# self._feature_dimensionality.
if binarized:
if aggregated_descriptors_1.dtype != 'uint8':
raise ValueError('Incorrect input descriptor type: %s' %
aggregated_descriptors_1.dtype)
if aggregated_descriptors_2.dtype != 'uint8':
raise ValueError('Incorrect input descriptor type: %s' %
aggregated_descriptors_2.dtype)
per_visual_word_dimensionality = int(
len(aggregated_descriptors_1) / num_visual_words_1)
if len(aggregated_descriptors_2
) / num_visual_words_2 != per_visual_word_dimensionality:
raise ValueError('ASMK* dimensionality is inconsistent.')
else:
per_visual_word_dimensionality = self._feature_dimensionality
self._CheckAsmkDimensionality(aggregated_descriptors_1,
num_visual_words_1, '1')
self._CheckAsmkDimensionality(aggregated_descriptors_2,
num_visual_words_2, '2')
aggregated_descriptors_1_reshape = np.reshape(
aggregated_descriptors_1,
[num_visual_words_1, per_visual_word_dimensionality])
aggregated_descriptors_2_reshape = np.reshape(
aggregated_descriptors_2,
[num_visual_words_2, per_visual_word_dimensionality])
# Loop over visual words, compute similarity.
unnormalized_similarity = 0.0
ind_1 = 0
ind_2 = 0
while ind_1 < num_visual_words_1 and ind_2 < num_visual_words_2:
if visual_words_1[ind_1] == visual_words_2[ind_2]:
if binarized:
inner_product = self._BinaryNormalizedInnerProduct(
aggregated_descriptors_1_reshape[ind_1],
aggregated_descriptors_2_reshape[ind_2])
else:
inner_product = np.dot(aggregated_descriptors_1_reshape[ind_1],
aggregated_descriptors_2_reshape[ind_2])
unnormalized_similarity += self._SigmaFn(inner_product)
ind_1 += 1
ind_2 += 1
elif visual_words_1[ind_1] > visual_words_2[ind_2]:
ind_2 += 1
else:
ind_1 += 1
final_similarity = unnormalized_similarity
if self._use_l2_normalization:
final_similarity /= np.sqrt(num_visual_words_1 * num_visual_words_2)
return final_similarity
# Copyright 2019 The TensorFlow Authors All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Tests for DELF feature aggregation similarity."""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import numpy as np
import tensorflow as tf
from delf import aggregation_config_pb2
from delf import feature_aggregation_similarity
class FeatureAggregationSimilarityTest(tf.test.TestCase):
def testComputeVladSimilarityWorks(self):
# Construct inputs.
vlad_1 = np.array([0, 1, 2, 3, 4])
vlad_2 = np.array([5, 6, 7, 8, 9])
config = aggregation_config_pb2.AggregationConfig()
config.aggregation_type = aggregation_config_pb2.AggregationConfig.VLAD
# Run tested function.
similarity_computer = (
feature_aggregation_similarity.SimilarityAggregatedRepresentation(
config))
similarity = similarity_computer.ComputeSimilarity(vlad_1, vlad_2)
# Define expected results.
exp_similarity = 80
# Compare actual and expected results.
self.assertAllEqual(similarity, exp_similarity)
def testComputeAsmkSimilarityWorks(self):
# Construct inputs.
aggregated_descriptors_1 = np.array([
0.0, 0.0, -0.707107, -0.707107, 0.5, 0.866025, 0.816497, 0.577350, 1.0,
0.0
])
visual_words_1 = np.array([0, 1, 2, 3, 4])
aggregated_descriptors_2 = np.array(
[0.0, 1.0, 1.0, 0.0, 0.707107, 0.707107])
visual_words_2 = np.array([1, 2, 4])
config = aggregation_config_pb2.AggregationConfig()
config.codebook_size = 5
config.feature_dimensionality = 2
config.aggregation_type = aggregation_config_pb2.AggregationConfig.ASMK
config.use_l2_normalization = True
# Run tested function.
similarity_computer = (
feature_aggregation_similarity.SimilarityAggregatedRepresentation(
config))
similarity = similarity_computer.ComputeSimilarity(
aggregated_descriptors_1, aggregated_descriptors_2, visual_words_1,
visual_words_2)
# Define expected results.
exp_similarity = 0.123562
# Compare actual and expected results.
self.assertAllClose(similarity, exp_similarity)
def testComputeAsmkSimilarityNoNormalizationWorks(self):
# Construct inputs.
aggregated_descriptors_1 = np.array([
0.0, 0.0, -0.707107, -0.707107, 0.5, 0.866025, 0.816497, 0.577350, 1.0,
0.0
])
visual_words_1 = np.array([0, 1, 2, 3, 4])
aggregated_descriptors_2 = np.array(
[0.0, 1.0, 1.0, 0.0, 0.707107, 0.707107])
visual_words_2 = np.array([1, 2, 4])
config = aggregation_config_pb2.AggregationConfig()
config.codebook_size = 5
config.feature_dimensionality = 2
config.aggregation_type = aggregation_config_pb2.AggregationConfig.ASMK
config.use_l2_normalization = False
# Run tested function.
similarity_computer = (
feature_aggregation_similarity.SimilarityAggregatedRepresentation(
config))
similarity = similarity_computer.ComputeSimilarity(
aggregated_descriptors_1, aggregated_descriptors_2, visual_words_1,
visual_words_2)
# Define expected results.
exp_similarity = 0.478554
# Compare actual and expected results.
self.assertAllClose(similarity, exp_similarity)
def testComputeAsmkStarSimilarityWorks(self):
# Construct inputs.
aggregated_descriptors_1 = np.array([0, 0, 3, 3, 3], dtype='uint8')
visual_words_1 = np.array([0, 1, 2, 3, 4])
aggregated_descriptors_2 = np.array([1, 2, 3], dtype='uint8')
visual_words_2 = np.array([1, 2, 4])
config = aggregation_config_pb2.AggregationConfig()
config.codebook_size = 5
config.feature_dimensionality = 2
config.aggregation_type = aggregation_config_pb2.AggregationConfig.ASMK_STAR
config.use_l2_normalization = True
# Run tested function.
similarity_computer = (
feature_aggregation_similarity.SimilarityAggregatedRepresentation(
config))
similarity = similarity_computer.ComputeSimilarity(
aggregated_descriptors_1, aggregated_descriptors_2, visual_words_1,
visual_words_2)
# Define expected results.
exp_similarity = 0.258199
# Compare actual and expected results.
self.assertAllClose(similarity, exp_similarity)
if __name__ == '__main__':
tf.test.main()