Unverified commit bc7c9810, authored by André Araujo, committed by GitHub

Add DELG code (#8553)

* Merged commit includes the following changes:
253126424  by Andre Araujo:

    Scripts to compute metrics for Google Landmarks dataset.

    Also, a small fix to metric in retrieval case: avoids duplicate predicted images.

--
253118971  by Andre Araujo:

    Metrics for Google Landmarks dataset.

--
253106953  by Andre Araujo:

    Library to read files from Google Landmarks challenges.

--
250700636  by Andre Araujo:

    Handle case of aggregation extraction with empty set of input features.

--
250516819  by Andre Araujo:

    Add minimum size for DELF extractor.

--
250435822  by Andre Araujo:

    Add max_image_size/min_image_size for open-source DELF proto / module.

--
250414606  by Andre Araujo:

    Refactor extract_aggregation to allow reuse with different datasets.

--
250356863  by Andre Araujo:

    Remove unnecessary cmd_args variable from boxes_and_features_extraction.

--
249783379  by Andre Araujo:

    Create directory for writing mapping file if it does not exist.

--
249581591  by Andre Araujo:

    Refactor scripts to extract boxes and features from images in Revisited datasets.
    Also, change tf.logging.info --> print for easier logging in open source code.

--
249511821  by Andre Araujo:

    Small change to function for file/directory handling.

--
249289499  by Andre Araujo:

    Internal change.

--

PiperOrigin-RevId: 253126424

* Updating DELF init to adjust to latest changes

* Editing init files for python packages

* Edit D2R dataset reader to work with py3.

PiperOrigin-RevId: 253135576

* DELF package: fix import ordering

* Adding new requirements to setup.py

* Adding init file for training dir

* Merged commit includes the following changes:

FolderOrigin-RevId: /google/src/cloud/andrearaujo/delf_oss/google3/..

* Adding init file for training subdirs

* Working version of DELF training

* Internal change.

PiperOrigin-RevId: 253248648

* Fix variance loading in open-source code.

PiperOrigin-RevId: 260619120

* Separate image re-ranking as a standalone library, and add metric writing to dataset library.

PiperOrigin-RevId: 260998608

* Tool to read written D2R Revisited datasets metrics file. Test is added.

Also adds a unit test for previously-existing SaveMetricsFile function.

PiperOrigin-RevId: 263361410

* Add optional resize factor for feature extraction.

PiperOrigin-RevId: 264437080

* Fix NumPy's new version spacing changes.

PiperOrigin-RevId: 265127245

* Make image matching function visible, and add support for RANSAC seed.

PiperOrigin-RevId: 277177468

* Avoid matplotlib failure due to missing display backend.

PiperOrigin-RevId: 287316435

* Removes tf.contrib dependency.

PiperOrigin-RevId: 288842237

* Fix tf contrib removal for feature_aggregation_extractor.

PiperOrigin-RevId: 289487669

* Merged commit includes the following changes:
309118395  by Andre Araujo:

    Make DELF open-source code compatible with TF2.

--
309067582  by Andre Araujo:

    Handle image resizing rounding properly for python extraction.

    New behavior is tested with unit tests.

--
308690144  by Andre Araujo:

    Several changes to improve DELF model/training code and make it work in TF 2.1.0:
    - Rename some files for better clarity
    - Using compat.v1 versions of functions
    - Formatting changes
    - Using more appropriate TF function names

--
308689397  by Andre Araujo:

    Internal change.

--
308341315  by Andre Araujo:

    Remove old slim dependency in DELF open-source model.

    This avoids issues with requiring old TF-v1, making it compatible with latest TF.

--
306777559  by Andre Araujo:

    Internal change

--
304505811  by Andre Araujo:

    Raise error during geometric verification if local features have different dimensionalities.

--
301739992  by Andre Araujo:

    Transform some geometric verification constants into arguments, to allow custom matching.

--
301300324  by Andre Araujo:

    Apply name change (experimental_run_v2 -> run) for all callers in TensorFlow.

--
299919057  by Andre Araujo:

    Automated refactoring to make code Python 3 compatible.

--
297953698  by Andre Araujo:

    Explicitly replace "import tensorflow" with "tensorflow.compat.v1" for TF2.x migration

--
297521242  by Andre Araujo:

    Explicitly replace "import tensorflow" with "tensorflow.compat.v1" for TF2.x migration

--
297278247  by Andre Araujo:

    Explicitly replace "import tensorflow" with "tensorflow.compat.v1" for TF2.x migration

--
297270405  by Andre Araujo:

    Explicitly replace "import tensorflow" with "tensorflow.compat.v1" for TF2.x migration

--
297238741  by Andre Araujo:

    Explicitly replace "import tensorflow" with "tensorflow.compat.v1" for TF2.x migration

--
297108605  by Andre Araujo:

    Explicitly replace "import tensorflow" with "tensorflow.compat.v1" for TF2.x migration

--
294676131  by Andre Araujo:

    Add option to resize images to square resolutions without aspect ratio preservation.

--
293849641  by Andre Araujo:

    Internal change.

--
293840896  by Andre Araujo:

    Changing Slim import to tf_slim codebase.

--
293661660  by Andre Araujo:

    Allow the delf training script to read from TFRecords dataset.

--
291755295  by Andre Araujo:

    Internal change.

--
291448508  by Andre Araujo:

    Internal change.

--
291414459  by Andre Araujo:

    Adding train script.

--
291384336  by Andre Araujo:

    Adding model export script and test.

--
291260565  by Andre Araujo:

    Adding placeholder for Google Landmarks dataset.

--
291205548  by Andre Araujo:

    Definition of DELF model using Keras ResNet50 as backbone.

--
289500793  by Andre Araujo:

    Add TFRecord building script for delf.

--

PiperOrigin-RevId: 309118395

* Updating README, dependency versions

* Updating training README

* Fixing init import of export_model

* Fixing init import of export_model_utils

* tkinter in INSTALL_INSTRUCTIONS

* Merged commit includes the following changes:

FolderOrigin-RevId: /google/src/cloud/andrearaujo/delf_oss/google3/..

* INSTALL_INSTRUCTIONS mentioning different cloning options

* Updating required TF version, since 2.1 is not available in pip

* Internal change.

PiperOrigin-RevId: 309136003

* Fix missing string_input_producer and start_queue_runners in TF2.

PiperOrigin-RevId: 309437512

* Handle RANSAC from skimage's latest versions.

PiperOrigin-RevId: 310170897

* DELF 2.1 version: badge and setup.py updated

* Add TF version badge in INSTALL_INSTRUCTIONS and paper badges in README

* Add paper badges in paper instructions

* Add paper badge to landmark detection instructions

* Small update to DELF training README

* Merged commit includes the following changes:
312614961  by Andre Araujo:

    Instructions/code to reproduce DELG paper results.

--
312523414  by Andre Araujo:

    Fix a minor bug when post-processing extracted features: format config.delf_global_config.image_scales_ind as a list.

--
312340276  by Andre Araujo:

    Add support for global feature extraction in DELF open-source codebase.

--
311031367  by Andre Araujo:

    Add use_square_images as an option in DELF config. The default value is false. If it is set, images are resized to square resolution before feature extraction (e.g., the Starburst use case). Considered for a while whether to have two constructors for DescriptorToImageTemplate, but in the end decided to keep only one, which may be less confusing.

--
310658638  by Andre Araujo:

    Option for producing local feature-based image match visualization.

--

PiperOrigin-RevId: 312614961

* DELF README update / DELG instructions

* DELF README update

* DELG instructions update

* Merged commit includes the following changes:

PiperOrigin-RevId: 312695597
parent 26565d0d
`README.md`:

# Deep Local and Global Image Features

[![TensorFlow 2.1](https://img.shields.io/badge/tensorflow-2.1-brightgreen)](https://github.com/tensorflow/tensorflow/releases/tag/v2.1.0)
[![Python 3.6](https://img.shields.io/badge/python-3.6-blue.svg)](https://www.python.org/downloads/release/python-360/)

This project presents code for extracting local and global image features, which
are particularly useful for large-scale instance-level image recognition. These
were introduced in the [DELF](https://arxiv.org/abs/1612.06321),
[Detect-to-Retrieve](https://arxiv.org/abs/1812.01584) and
[DELG](https://arxiv.org/abs/2001.05027) papers.

We also released pre-trained models based on the
[Google Landmarks dataset](https://www.kaggle.com/google/google-landmarks-dataset).
The pre-trained models released here have been optimized for landmark
recognition, so expect them to work well in this area. We also provide
tensorflow code for building and training models.

If you make use of this code, please consider citing the following papers:

DELF:
[![Paper](http://img.shields.io/badge/paper-arXiv.1612.06321-B3181B.svg)](https://arxiv.org/abs/1612.06321)
```
"Large-Scale Image Retrieval with Attentive Deep Local Features",
H. Noh, A. Araujo, J. Sim, T. Weyand and B. Han,
Proc. ICCV'17
```

Detect-to-Retrieve:
[![Paper](http://img.shields.io/badge/paper-arXiv.1812.01584-B3181B.svg)](https://arxiv.org/abs/1812.01584)
```
"Detect-to-Retrieve: Efficient Regional Aggregation for Image Search",
M. Teichmann*, A. Araujo*, M. Zhu and J. Sim,
Proc. CVPR'19
```

DELG:
[![Paper](http://img.shields.io/badge/paper-arXiv.2001.05027-B3181B.svg)](https://arxiv.org/abs/2001.05027)
```
"Unifying Deep Local and Global Features for Image Search",
B. Cao*, A. Araujo* and J. Sim,
arxiv:2001.05027
```

## News

- [Jan'20] Check out our new paper:
  ["Unifying Deep Local and Global Features for Image Search"](https://arxiv.org/abs/2001.05027)
- [Jun'19] DELF achieved 2nd place in
  [CVPR Visual Localization challenge (Local Features track)](https://sites.google.com/corp/view/ltvl2019).
  See our slides

...

Please follow [these instructions](delf/python/training/README.md).

### DELG

Please follow [these instructions](delf/python/delg/DELG_INSTRUCTIONS.md). At
the end, you should obtain image retrieval results on the Revisited Oxford/Paris
datasets.

### Landmark detection

Please follow [these instructions](DETECTION.md). At the end, you should obtain

...

- `match_images.py` supports image matching using DELF features extracted
  using `extract_features.py`.

The subdirectory `delf/python/delg` contains sample scripts/configs related to
the DELG paper:

- `delg_gld_config.pbtxt` gives the DelfConfig used in the DELG paper.
- `extract_features.py` for local+global feature extraction on Revisited
  datasets.
- `perform_retrieval.py` for performing retrieval/evaluating methods on
  Revisited datasets.

The subdirectory `delf/python/detect_to_retrieve` contains sample
scripts/configs related to the Detect-to-Retrieve paper:

...
In `delf/protos/delf_config.proto`:

@@ -51,29 +51,71 @@ message DelfLocalFeatureConfig {
  optional DelfPcaParameters pca_parameters = 6;
}

message DelfGlobalFeatureConfig {
  // If PCA is to be used, this must be set to true.
  optional bool use_pca = 1 [default = true];

  // PCA parameters for DELF global feature. This is used only if use_pca is
  // true.
  optional DelfPcaParameters pca_parameters = 2;

  // Denotes indices of DelfConfig's scales that will be used for global
  // descriptor extraction. For example, if DelfConfig's image_scales are
  // [0.25, 0.5, 1.0] and image_scales_ind is [0, 2], global descriptor
  // extraction will use solely scales [0.25, 1.0]. Note that local feature
  // extraction will still use [0.25, 0.5, 1.0] in this case. If empty
  // (default), all scales are used.
  repeated int32 image_scales_ind = 3;
}

message DelfConfig {
  // Whether to extract local features when using the model.
  // At least one of {use_local_features, use_global_features} must be true.
  optional bool use_local_features = 7 [default = true];

  // Configuration used for local features. Note: this is used only if
  // use_local_features is true.
  optional DelfLocalFeatureConfig delf_local_config = 3;

  // Whether to extract global features when using the model.
  // At least one of {use_local_features, use_global_features} must be true.
  optional bool use_global_features = 8 [default = false];

  // Configuration used for global features. Note: this is used only if
  // use_global_features is true.
  optional DelfGlobalFeatureConfig delf_global_config = 9;

  // Path to DELF model.
  optional string model_path = 1;  // Required.

  // Image scales to be used.
  repeated float image_scales = 2;

  // Image resizing options.
  // - The maximum/minimum image size (in terms of height or width) to be used
  //   when extracting DELF features. If set to -1 (default), there is no
  //   upper/lower bound for image size. If the use_square_images option is
  //   false (default):
  //   * If the height *OR* width is larger than max_image_size, it will be
  //     resized to max_image_size, and the other dimension will be resized by
  //     preserving the aspect ratio.
  //   * If both height *AND* width are smaller than min_image_size, the larger
  //     side is set to min_image_size.
  // - If the use_square_images option is true, the image is resized to square
  //   resolution. To be more specific:
  //   * If the height *OR* width is larger than max_image_size, it is resized
  //     to square resolution of max_image_size.
  //   * If both height *AND* width are smaller than min_image_size, it is
  //     resized to square resolution of min_image_size.
  //   * Else, if the input image's resolution is not square, it is resized to
  //     square resolution of the larger side.
  // Image resizing is useful when we want to ensure that the input to the image
  // pyramid has a reasonable number of pixels, which could have large impact in
  // terms of image matching performance.
  // When using local features, note that the feature locations and scales will
  // be consistent with the original image input size.
  // Note that when both max_image_size and min_image_size are specified
  // (which is a valid and legit use case), as long as max_image_size >=
  // min_image_size, there's no conflicting scenario (i.e. never triggers both
  // enlarging / shrinking). Bilinear interpolation is used.
  optional int32 max_image_size = 4 [default = -1];
  optional int32 min_image_size = 5 [default = -1];
  optional bool use_square_images = 6 [default = false];
}
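To make the resizing rules above concrete, here is a minimal Python sketch of the shape computation they describe (the helper name `resize_shape` is illustrative, not part of the library; the real logic lives in `extractor.ResizeImage`, whose diff appears further below):

```python
def resize_shape(height, width, max_image_size=-1, min_image_size=-1,
                 use_square_images=False):
  """Sketch: returns the (height, width) an image would be resized to."""
  largest_side = max(height, width)
  if max_image_size >= 0 and largest_side > max_image_size:
    scale_factor = max_image_size / largest_side
  elif min_image_size >= 0 and largest_side < min_image_size:
    scale_factor = min_image_size / largest_side
  elif use_square_images and height != width:
    scale_factor = 1.0  # No scaling, but still squashed to square below.
  else:
    return height, width  # No resizing needed.
  if use_square_images:
    side = int(round(largest_side * scale_factor))
    return side, side
  return int(round(height * scale_factor)), int(round(width * scale_factor))


# E.g., with max_image_size=1024, as in the DELG config further below:
print(resize_shape(1200, 1600, max_image_size=1024))  # (768, 1024)
print(resize_shape(600, 800, max_image_size=1024))    # (600, 800), unchanged
```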
## DELG instructions
[![Paper](http://img.shields.io/badge/paper-arXiv.2001.05027-B3181B.svg)](https://arxiv.org/abs/2001.05027)
These instructions can be used to reproduce the results from the
[DELG paper](https://arxiv.org/abs/2001.05027) for the Revisited Oxford/Paris
datasets.
### Download datasets
```bash
mkdir -p ~/delg/data && cd ~/delg/data
# Oxford dataset.
wget http://www.robots.ox.ac.uk/~vgg/data/oxbuildings/oxbuild_images.tgz
mkdir oxford5k_images
tar -xvzf oxbuild_images.tgz -C oxford5k_images/
# Paris dataset. Download and move all images to same directory.
wget http://www.robots.ox.ac.uk/~vgg/data/parisbuildings/paris_1.tgz
wget http://www.robots.ox.ac.uk/~vgg/data/parisbuildings/paris_2.tgz
mkdir paris6k_images_tmp
tar -xvzf paris_1.tgz -C paris6k_images_tmp/
tar -xvzf paris_2.tgz -C paris6k_images_tmp/
mkdir paris6k_images
mv paris6k_images_tmp/paris/*/*.jpg paris6k_images/
# Revisited annotations.
wget http://cmp.felk.cvut.cz/revisitop/data/datasets/roxford5k/gnd_roxford5k.mat
wget http://cmp.felk.cvut.cz/revisitop/data/datasets/rparis6k/gnd_rparis6k.mat
```
### Download model
This is necessary to reproduce the main paper results:
```bash
# From models/research/delf/delf/python/delg
mkdir parameters && cd parameters
# DELG-GLD model.
wget http://storage.googleapis.com/delf/delg_gld_20200520.tar.gz
tar -xvzf delg_gld_20200520.tar.gz
```
### Feature extraction
We present here commands for extraction on `roxford5k`. To extract on `rparis6k`
instead, please edit the arguments accordingly (especially the
`dataset_file_path` argument).
#### Query feature extraction
For query feature extraction, the cropped query image should be used to extract
features, according to the Revisited Oxford/Paris experimental protocol. Note
that this is done in the `extract_features` script, when setting
`image_set=query`.
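To make the protocol concrete, here is a hedged sketch of the cropping and resize-factor bookkeeping, with a placeholder image and box; it mirrors what `extract_features.py`, listed later in this commit, does for `image_set=query`:

```python
from PIL import Image

pil_im = Image.new('RGB', (800, 600))  # Placeholder for a dataset image.
bbox = [100, 120, 500, 400]  # Placeholder for ground_truth[i]['bbx'].

original_image_size = max(pil_im.size)
pil_im = pil_im.crop(bbox)  # Features are extracted on the cropped query.
cropped_image_size = max(pil_im.size)
# Passed to the extractor so the max/min image-size limits are scaled
# relative to the original (uncropped) image size.
resize_factor = cropped_image_size / original_image_size
print(resize_factor)  # 0.5: the crop's largest side is half the original's.
```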
Query feature extraction can be run as follows:
```bash
# From models/research/delf/delf/python/delg
python3 extract_features.py \
--delf_config_path delg_gld_config.pbtxt \
--dataset_file_path ~/delg/data/gnd_roxford5k.mat \
--images_dir ~/delg/data/oxford5k_images \
--image_set query \
--output_features_dir ~/delg/data/oxford5k_features/query
```
#### Index feature extraction
Run index feature extraction as follows:
```bash
# From models/research/delf/delf/python/delg
python3 extract_features.py \
--delf_config_path delg_gld_config.pbtxt \
--dataset_file_path ~/delg/data/gnd_roxford5k.mat \
--images_dir ~/delg/data/oxford5k_images \
--image_set index \
--output_features_dir ~/delg/data/oxford5k_features/index
```
### Perform retrieval
To run retrieval on `roxford5k`, the following command can be used:
```bash
# From models/research/delf/delf/python/delg
python3 perform_retrieval.py \
--dataset_file_path ~/delg/data/gnd_roxford5k.mat \
--query_features_dir ~/delg/data/oxford5k_features/query \
--index_features_dir ~/delg/data/oxford5k_features/index \
--output_dir ~/delg/results/oxford5k
```
A file named `metrics.txt` will be written to the path given in
`output_dir`, with retrieval metrics for an experiment where geometric
verification is not used. The contents should look approximately like:
```
hard
mAP=45.11
mP@k[ 1 5 10] [85.71 72.29 60.14]
mR@k[ 1 5 10] [19.15 29.72 36.32]
medium
mAP=69.71
mP@k[ 1 5 10] [95.71 92. 86.86]
mR@k[ 1 5 10] [10.17 25.94 33.83]
```
which are the results presented in Table 3 of the paper.
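For reference, mAP, mP@k and mR@k are means over queries of per-query average precision and precision/recall at rank k. A rough sketch of the per-query precision/recall computation (hedged: it ignores the Revisited protocol's junk-image handling, which `dataset.ComputeMetrics` takes care of):

```python
import numpy as np


def precision_recall_at_k(ranks, positive_ids, k):
  """Sketch: precision/recall at rank k for one query."""
  num_relevant_retrieved = np.isin(ranks[:k], positive_ids).sum()
  return num_relevant_retrieved / k, num_relevant_retrieved / len(positive_ids)


# Toy example: positives {3, 7}, with image 7 ranked first.
print(precision_recall_at_k(np.array([7, 1, 3, 0, 2]), [3, 7], 1))  # (1.0, 0.5)
```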
If you want to run retrieval with geometric verification, set
`use_geometric_verification` to `True`. This is much slower, since (1) in this
code example, re-ranking loads DELF local features from disk, and (2)
re-ranking must be performed separately for each dataset protocol, because each
protocol's junk images must be removed during re-ranking. Here is an example
command:
```bash
# From models/research/delf/delf/python/delg
python3 perform_retrieval.py \
--dataset_file_path ~/delg/data/gnd_roxford5k.mat \
--query_features_dir ~/delg/data/oxford5k_features/query \
--index_features_dir ~/delg/data/oxford5k_features/index \
--use_geometric_verification \
--output_dir ~/delg/results/oxford5k_with_gv
```
The `metrics.txt` should now show:
```
hard
mAP=45.11
mP@k[ 1 5 10] [85.71 72.29 60.14]
mR@k[ 1 5 10] [19.15 29.72 36.32]
hard_after_gv
mAP=53.72
mP@k[ 1 5 10] [91.43 83.81 74.38]
mR@k[ 1 5 10] [19.45 34.45 44.64]
medium
mAP=69.71
mP@k[ 1 5 10] [95.71 92. 86.86]
mR@k[ 1 5 10] [10.17 25.94 33.83]
medium_after_gv
mAP=75.42
mP@k[ 1 5 10] [97.14 95.24 93.81]
mR@k[ 1 5 10] [10.21 27.21 37.72]
```
which, again, are the results presented in Table 3 of the paper.
The `delg_gld_config.pbtxt` configuration used in the commands above:

use_local_features: true
use_global_features: true
model_path: "parameters/delg_gld_20200520"
image_scales: 0.25
image_scales: 0.35355338
image_scales: 0.5
image_scales: 0.70710677
image_scales: 1.0
image_scales: 1.4142135
image_scales: 2.0
delf_local_config {
  use_pca: false
  max_feature_num: 1000
  score_threshold: 175.0
}
delf_global_config {
  use_pca: false
  image_scales_ind: 3
  image_scales_ind: 4
  image_scales_ind: 5
}
max_image_size: 1024
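Per the `image_scales_ind` semantics in the proto above, this config pools the global descriptor over only three of the seven scales, while local feature extraction still uses all seven; a quick check:

```python
image_scales = [0.25, 0.35355338, 0.5, 0.70710677, 1.0, 1.4142135, 2.0]
image_scales_ind = [3, 4, 5]
# Scales actually used for global descriptor extraction:
print([image_scales[i] for i in image_scales_ind])
# [0.70710677, 1.0, 1.4142135]
```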
`delf/python/delg/extract_features.py`:

# Copyright 2020 The TensorFlow Authors All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Extracts DELG features for images from Revisited Oxford/Paris datasets.
Note that query images are cropped before feature extraction, as required by the
evaluation protocols of these datasets.
The program checks if features already exist, and skips computation for those.
"""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import os
import time
from absl import app
from absl import flags
import numpy as np
from PIL import Image
from PIL import ImageFile
import tensorflow as tf
from google.protobuf import text_format
from delf import delf_config_pb2
from delf import datum_io
from delf import feature_io
from delf.python.detect_to_retrieve import dataset
from delf import extractor
FLAGS = flags.FLAGS
flags.DEFINE_string(
    'delf_config_path', '/tmp/delf_config_example.pbtxt',
    'Path to DelfConfig proto text file with configuration to be used for DELG '
    'extraction.')
flags.DEFINE_string(
    'dataset_file_path', '/tmp/gnd_roxford5k.mat',
    'Dataset file for Revisited Oxford or Paris dataset, in .mat format.')
flags.DEFINE_string(
    'images_dir', '/tmp/images',
    'Directory where dataset images are located, all in .jpg format.')
flags.DEFINE_enum('image_set', 'query', ['query', 'index'],
                  'Whether to extract features from query or index images.')
flags.DEFINE_string(
    'output_features_dir', '/tmp/features',
    "Directory where DELG features will be written to. Each image's features "
    'will be written to files with same name but different extension: the '
    'global feature is written to a file with extension .delg_global and the '
    'local features are written to a file with extension .delg_local.')

# Extensions.
_DELG_GLOBAL_EXTENSION = '.delg_global'
_DELG_LOCAL_EXTENSION = '.delg_local'
_IMAGE_EXTENSION = '.jpg'

# To avoid PIL crashing for truncated (corrupted) images.
ImageFile.LOAD_TRUNCATED_IMAGES = True

# Pace to report extraction log.
_STATUS_CHECK_ITERATIONS = 50


def _PilLoader(path):
  """Helper function to read image with PIL.

  Args:
    path: Path to image to be loaded.

  Returns:
    PIL image in RGB format.
  """
  with tf.io.gfile.GFile(path, 'rb') as f:
    img = Image.open(f)
    return img.convert('RGB')


def main(argv):
  if len(argv) > 1:
    raise RuntimeError('Too many command-line arguments.')

  # Read list of images from dataset file.
  print('Reading list of images from dataset file...')
  query_list, index_list, ground_truth = dataset.ReadDatasetFile(
      FLAGS.dataset_file_path)
  if FLAGS.image_set == 'query':
    image_list = query_list
  else:
    image_list = index_list
  num_images = len(image_list)
  print('done! Found %d images' % num_images)

  # Parse DelfConfig proto.
  config = delf_config_pb2.DelfConfig()
  with tf.io.gfile.GFile(FLAGS.delf_config_path, 'r') as f:
    text_format.Parse(f.read(), config)

  # Create output directory if necessary.
  if not tf.io.gfile.exists(FLAGS.output_features_dir):
    tf.io.gfile.makedirs(FLAGS.output_features_dir)

  with tf.Graph().as_default():
    with tf.compat.v1.Session() as sess:
      # Initialize variables, construct DELG extractor.
      init_op = tf.compat.v1.global_variables_initializer()
      sess.run(init_op)
      extractor_fn = extractor.MakeExtractor(sess, config)

      start = time.time()
      for i in range(num_images):
        if i == 0:
          print('Starting to extract features...')
        elif i % _STATUS_CHECK_ITERATIONS == 0:
          elapsed = (time.time() - start)
          print('Processing image %d out of %d, last %d '
                'images took %f seconds' %
                (i, num_images, _STATUS_CHECK_ITERATIONS, elapsed))
          start = time.time()

        image_name = image_list[i]
        input_image_filename = os.path.join(FLAGS.images_dir,
                                            image_name + _IMAGE_EXTENSION)

        # Skip the image if both output feature files already exist.
        output_global_feature_filename = os.path.join(
            FLAGS.output_features_dir, image_name + _DELG_GLOBAL_EXTENSION)
        output_local_feature_filename = os.path.join(
            FLAGS.output_features_dir, image_name + _DELG_LOCAL_EXTENSION)
        if tf.io.gfile.exists(
            output_global_feature_filename) and tf.io.gfile.exists(
                output_local_feature_filename):
          print('Skipping %s' % image_name)
          continue

        pil_im = _PilLoader(input_image_filename)
        resize_factor = 1.0
        if FLAGS.image_set == 'query':
          # Crop query image according to bounding box.
          original_image_size = max(pil_im.size)
          bbox = [int(round(b)) for b in ground_truth[i]['bbx']]
          pil_im = pil_im.crop(bbox)
          cropped_image_size = max(pil_im.size)
          resize_factor = cropped_image_size / original_image_size

        im = np.array(pil_im)

        # Extract and save features.
        extracted_features = extractor_fn(im, resize_factor)
        global_descriptor = extracted_features['global_descriptor']
        locations = extracted_features['local_features']['locations']
        descriptors = extracted_features['local_features']['descriptors']
        feature_scales = extracted_features['local_features']['scales']
        attention = extracted_features['local_features']['attention']

        datum_io.WriteToFile(global_descriptor, output_global_feature_filename)
        feature_io.WriteToFile(output_local_feature_filename, locations,
                               feature_scales, descriptors, attention)


if __name__ == '__main__':
  app.run(main)
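The written features can be loaded back with the same I/O helpers the retrieval script uses; a hedged sketch (the path and image name are illustrative):

```python
from delf import datum_io

# Reads one global descriptor written by extract_features.py above.
global_descriptor = datum_io.ReadFromFile(
    '/tmp/features/query/all_souls_000013.delg_global')
print(global_descriptor.shape)  # (D,), the global descriptor dimensionality.
```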
`delf/python/delg/perform_retrieval.py`:

# Copyright 2020 The TensorFlow Authors All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Performs DELG-based image retrieval on Revisited Oxford/Paris datasets."""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import os
import time
from absl import app
from absl import flags
import numpy as np
import tensorflow as tf
from delf import datum_io
from delf.python.detect_to_retrieve import dataset
from delf.python.detect_to_retrieve import image_reranking
FLAGS = flags.FLAGS
flags.DEFINE_string(
    'dataset_file_path', '/tmp/gnd_roxford5k.mat',
    'Dataset file for Revisited Oxford or Paris dataset, in .mat format.')
flags.DEFINE_string('query_features_dir', '/tmp/features/query',
                    'Directory where query DELG features are located.')
flags.DEFINE_string('index_features_dir', '/tmp/features/index',
                    'Directory where index DELG features are located.')
flags.DEFINE_boolean(
    'use_geometric_verification', False,
    'If True, performs re-ranking using local feature-based geometric '
    'verification.')
flags.DEFINE_float(
    'local_feature_distance_threshold', 1.0,
    'Optional, only used if `use_geometric_verification` is True. '
    'Distance threshold below which a pair of local descriptors is considered '
    'a potential match, and will be fed into RANSAC.')
flags.DEFINE_float(
    'ransac_residual_threshold', 20.0,
    'Optional, only used if `use_geometric_verification` is True. '
    'Residual error threshold for considering matches as inliers, used in '
    'RANSAC algorithm.')
flags.DEFINE_string(
    'output_dir', '/tmp/retrieval',
    'Directory where retrieval output will be written to. A file containing '
    "metrics for this run is saved therein, with file name 'metrics.txt'.")

# Extensions.
_DELG_GLOBAL_EXTENSION = '.delg_global'
_DELG_LOCAL_EXTENSION = '.delg_local'

# Precision-recall ranks to use in metric computation.
_PR_RANKS = (1, 5, 10)

# Pace to log.
_STATUS_CHECK_LOAD_ITERATIONS = 50

# Output file names.
_METRICS_FILENAME = 'metrics.txt'


def _ReadDelgGlobalDescriptors(input_dir, image_list):
  """Reads DELG global features.

  Args:
    input_dir: Directory where features are located.
    image_list: List of image names for which to load features.

  Returns:
    global_descriptors: NumPy array of shape (len(image_list), D), where D
      corresponds to the global descriptor dimensionality.
  """
  num_images = len(image_list)
  global_descriptors = []
  print('Starting to collect global descriptors for %d images...' % num_images)
  start = time.time()
  for i in range(num_images):
    if i > 0 and i % _STATUS_CHECK_LOAD_ITERATIONS == 0:
      elapsed = (time.time() - start)
      print('Reading global descriptors for image %d out of %d, last %d '
            'images took %f seconds' %
            (i, num_images, _STATUS_CHECK_LOAD_ITERATIONS, elapsed))
      start = time.time()

    descriptor_filename = image_list[i] + _DELG_GLOBAL_EXTENSION
    descriptor_fullpath = os.path.join(input_dir, descriptor_filename)
    global_descriptors.append(datum_io.ReadFromFile(descriptor_fullpath))

  return np.array(global_descriptors)


def main(argv):
  if len(argv) > 1:
    raise RuntimeError('Too many command-line arguments.')

  # Parse dataset to obtain query/index images, and ground-truth.
  print('Parsing dataset...')
  query_list, index_list, ground_truth = dataset.ReadDatasetFile(
      FLAGS.dataset_file_path)
  num_query_images = len(query_list)
  num_index_images = len(index_list)
  (_, medium_ground_truth,
   hard_ground_truth) = dataset.ParseEasyMediumHardGroundTruth(ground_truth)
  print('done! Found %d queries and %d index images' %
        (num_query_images, num_index_images))

  # Read global features.
  query_global_features = _ReadDelgGlobalDescriptors(FLAGS.query_features_dir,
                                                     query_list)
  index_global_features = _ReadDelgGlobalDescriptors(FLAGS.index_features_dir,
                                                     index_list)

  # Compute similarity between query and index images, potentially re-ranking
  # with geometric verification.
  ranks_before_gv = np.zeros([num_query_images, num_index_images],
                             dtype='int32')
  if FLAGS.use_geometric_verification:
    medium_ranks_after_gv = np.zeros([num_query_images, num_index_images],
                                     dtype='int32')
    hard_ranks_after_gv = np.zeros([num_query_images, num_index_images],
                                   dtype='int32')
  for i in range(num_query_images):
    print('Performing retrieval with query %d (%s)...' % (i, query_list[i]))
    start = time.time()

    # Compute similarity between global descriptors.
    similarities = np.dot(index_global_features, query_global_features[i])
    ranks_before_gv[i] = np.argsort(-similarities)

    # Re-rank using geometric verification.
    if FLAGS.use_geometric_verification:
      medium_ranks_after_gv[i] = image_reranking.RerankByGeometricVerification(
          input_ranks=ranks_before_gv[i],
          initial_scores=similarities,
          query_name=query_list[i],
          index_names=index_list,
          query_features_dir=FLAGS.query_features_dir,
          index_features_dir=FLAGS.index_features_dir,
          junk_ids=set(medium_ground_truth[i]['junk']),
          local_feature_extension=_DELG_LOCAL_EXTENSION,
          ransac_seed=0,
          feature_distance_threshold=FLAGS.local_feature_distance_threshold,
          ransac_residual_threshold=FLAGS.ransac_residual_threshold)
      hard_ranks_after_gv[i] = image_reranking.RerankByGeometricVerification(
          input_ranks=ranks_before_gv[i],
          initial_scores=similarities,
          query_name=query_list[i],
          index_names=index_list,
          query_features_dir=FLAGS.query_features_dir,
          index_features_dir=FLAGS.index_features_dir,
          junk_ids=set(hard_ground_truth[i]['junk']),
          local_feature_extension=_DELG_LOCAL_EXTENSION,
          ransac_seed=0,
          feature_distance_threshold=FLAGS.local_feature_distance_threshold,
          ransac_residual_threshold=FLAGS.ransac_residual_threshold)

    elapsed = (time.time() - start)
    print('done! Retrieval for query %d took %f seconds' % (i, elapsed))

  # Create output directory if necessary.
  if not tf.io.gfile.exists(FLAGS.output_dir):
    tf.io.gfile.makedirs(FLAGS.output_dir)

  # Compute metrics.
  medium_metrics = dataset.ComputeMetrics(ranks_before_gv, medium_ground_truth,
                                          _PR_RANKS)
  hard_metrics = dataset.ComputeMetrics(ranks_before_gv, hard_ground_truth,
                                        _PR_RANKS)
  if FLAGS.use_geometric_verification:
    medium_metrics_after_gv = dataset.ComputeMetrics(medium_ranks_after_gv,
                                                     medium_ground_truth,
                                                     _PR_RANKS)
    hard_metrics_after_gv = dataset.ComputeMetrics(hard_ranks_after_gv,
                                                   hard_ground_truth, _PR_RANKS)

  # Write metrics to file.
  mean_average_precision_dict = {
      'medium': medium_metrics[0],
      'hard': hard_metrics[0]
  }
  mean_precisions_dict = {'medium': medium_metrics[1], 'hard': hard_metrics[1]}
  mean_recalls_dict = {'medium': medium_metrics[2], 'hard': hard_metrics[2]}
  if FLAGS.use_geometric_verification:
    mean_average_precision_dict.update({
        'medium_after_gv': medium_metrics_after_gv[0],
        'hard_after_gv': hard_metrics_after_gv[0]
    })
    mean_precisions_dict.update({
        'medium_after_gv': medium_metrics_after_gv[1],
        'hard_after_gv': hard_metrics_after_gv[1]
    })
    mean_recalls_dict.update({
        'medium_after_gv': medium_metrics_after_gv[2],
        'hard_after_gv': hard_metrics_after_gv[2]
    })
  dataset.SaveMetricsFile(mean_average_precision_dict, mean_precisions_dict,
                          mean_recalls_dict, _PR_RANKS,
                          os.path.join(FLAGS.output_dir, _METRICS_FILENAME))


if __name__ == '__main__':
  app.run(main)
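Since the extractor L2-normalizes global descriptors (see the `final_l2_normalization` op in the extractor diff below), the `np.dot` above computes cosine similarity, and `argsort(-similarities)` ranks most-similar images first. A self-contained sanity check:

```python
import numpy as np

rng = np.random.default_rng(0)
index_global_features = rng.normal(size=(5, 4))
index_global_features /= np.linalg.norm(
    index_global_features, axis=1, keepdims=True)

query = index_global_features[2]  # Query identical to index image 2.
similarities = np.dot(index_global_features, query)
ranks = np.argsort(-similarities)
print(ranks[0])  # 2: the identical image is ranked first.
```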
In `delf/python/detect_to_retrieve/boxes_and_features_extraction.py`:

@@ -214,8 +214,11 @@ def ExtractBoxesAndFeaturesToFiles(image_names, image_paths, delf_config_path,
      else:
        im = np.array(pil_im)

      extracted_features = delf_extractor_fn(im)
      locations_out = extracted_features['local_features']['locations']
      descriptors_out = extracted_features['local_features']['descriptors']
      feature_scales_out = extracted_features['local_features']['scales']
      attention_out = extracted_features['local_features']['attention']

      feature_io.WriteToFile(output_feature_filename, locations_out,
                             feature_scales_out, descriptors_out,
@@ -112,8 +112,11 @@ def main(argv):
    im = np.array(_PilLoader(input_image_filename).crop(bbox))

    # Extract and save features.
    extracted_features = extractor_fn(im)
    locations_out = extracted_features['local_features']['locations']
    descriptors_out = extracted_features['local_features']['descriptors']
    feature_scales_out = extracted_features['local_features']['scales']
    attention_out = extracted_features['local_features']['attention']

    feature_io.WriteToFile(output_feature_filename, locations_out,
                           feature_scales_out, descriptors_out,
In `delf/python/detect_to_retrieve/image_reranking.py`:

@@ -18,10 +18,13 @@
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import io
import os

import matplotlib.pyplot as plt
import numpy as np
from scipy import spatial
from skimage import feature
from skimage import measure
from skimage import transform

@@ -45,7 +48,11 @@
def MatchFeatures(query_locations,
                  index_image_descriptors,
                  ransac_seed=None,
                  feature_distance_threshold=0.9,
                  ransac_residual_threshold=10.0,
                  query_im_array=None,
                  index_im_array=None,
                  query_im_scale_factors=None,
                  index_im_scale_factors=None):
  """Matches local features using geometric verification.

  First, finds putative local feature matches by matching `query_descriptors`

@@ -67,9 +74,21 @@
      features is considered a potential match, and will be fed into RANSAC.
    ransac_residual_threshold: Residual error threshold for considering matches
      as inliers, used in RANSAC algorithm.
    query_im_array: Optional. If not None, contains a NumPy array with the query
      image, used to produce match visualization, if there is a match.
    index_im_array: Optional. Same as `query_im_array`, but for index image.
    query_im_scale_factors: Optional. If not None, contains a NumPy array with
      the query image scales, used to produce match visualization, if there is a
      match. If None and a visualization will be produced, [1.0, 1.0] is used
      (ie, feature locations are not scaled).
    index_im_scale_factors: Optional. Same as `query_im_scale_factors`, but for
      index image.

  Returns:
    score: Number of inliers of match. If no match is found, returns 0.
    match_viz_bytes: Encoded image bytes with visualization of the match, if
      there is one, and if `query_im_array` and `index_im_array` are properly
      set. Otherwise, it's an empty bytes string.

  Raises:
    ValueError: If local descriptors from query and index images have different

@@ -78,7 +97,7 @@
  num_features_query = query_locations.shape[0]
  num_features_index_image = index_image_locations.shape[0]
  if not num_features_query or not num_features_index_image:
    return 0, b''

  local_feature_dim = query_descriptors.shape[1]
  if index_image_descriptors.shape[1] != local_feature_dim:

@@ -105,7 +124,7 @@
  # If there are not enough putative matches, early return 0.
  if query_locations_to_use.shape[0] <= _MIN_RANSAC_SAMPLES:
    return 0, b''

  # Perform geometric verification using RANSAC.
  _, inliers = measure.ransac(

@@ -115,15 +134,49 @@
      residual_threshold=ransac_residual_threshold,
      max_trials=_NUM_RANSAC_TRIALS,
      random_state=ransac_seed)

  match_viz_bytes = b''
  if inliers is None:
    inliers = []
  elif query_im_array is not None and index_im_array is not None:
    if query_im_scale_factors is None:
      query_im_scale_factors = [1.0, 1.0]
    if index_im_scale_factors is None:
      index_im_scale_factors = [1.0, 1.0]
    inlier_idxs = np.nonzero(inliers)[0]
    _, ax = plt.subplots()
    ax.axis('off')
    ax.xaxis.set_major_locator(plt.NullLocator())
    ax.yaxis.set_major_locator(plt.NullLocator())
    plt.subplots_adjust(top=1, bottom=0, right=1, left=0, hspace=0, wspace=0)
    plt.margins(0, 0)
    feature.plot_matches(
        ax,
        query_im_array,
        index_im_array,
        query_locations_to_use * query_im_scale_factors,
        index_image_locations_to_use * index_im_scale_factors,
        np.column_stack((inlier_idxs, inlier_idxs)),
        only_matches=True)
    match_viz_io = io.BytesIO()
    plt.savefig(match_viz_io, format='jpeg', bbox_inches='tight', pad_inches=0)
    match_viz_bytes = match_viz_io.getvalue()

  return sum(inliers), match_viz_bytes


def RerankByGeometricVerification(input_ranks,
                                  initial_scores,
                                  query_name,
                                  index_names,
                                  query_features_dir,
                                  index_features_dir,
                                  junk_ids,
                                  local_feature_extension=_DELF_EXTENSION,
                                  ransac_seed=None,
                                  feature_distance_threshold=0.9,
                                  ransac_residual_threshold=10.0):
  """Re-ranks retrieval results using geometric verification.

  Args:

@@ -139,6 +192,13 @@
      (string).
    junk_ids: Set with indices of junk images which should not be considered
      during re-ranking.
    local_feature_extension: String, extension to use for loading local feature
      files.
    ransac_seed: Seed used by RANSAC. If None (default), no seed is provided.
    feature_distance_threshold: Distance threshold below which a pair of local
      features is considered a potential match, and will be fed into RANSAC.
    ransac_residual_threshold: Residual error threshold for considering matches
      as inliers, used in RANSAC algorithm.

  Returns:
    output_ranks: 1D NumPy array with index image indices, sorted from the most

@@ -168,7 +228,7 @@
  # Load query image features.
  query_features_path = os.path.join(query_features_dir,
                                     query_name + local_feature_extension)
  query_locations, _, query_descriptors, _, _ = feature_io.ReadFromFile(
      query_features_path)

@@ -187,13 +247,19 @@
    # Load index image features.
    index_image_features_path = os.path.join(
        index_features_dir,
        index_names[index_image_id] + local_feature_extension)
    (index_image_locations, _, index_image_descriptors, _,
     _) = feature_io.ReadFromFile(index_image_features_path)

    inliers_and_initial_scores[index_image_id][0], _ = MatchFeatures(
        query_locations,
        query_descriptors,
        index_image_locations,
        index_image_descriptors,
        ransac_seed=ransac_seed,
        feature_distance_threshold=feature_distance_threshold,
        ransac_residual_threshold=ransac_residual_threshold)

  # Sort based on (inliers_score, initial_score).
  def _InliersInitialScoresSorting(k):
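A hedged usage sketch for the new visualization path, with synthetic inputs (random features will rarely verify, so this typically returns score 0 and empty bytes, but it shows the two-value return):

```python
import numpy as np
from delf.python.detect_to_retrieve import image_reranking

rng = np.random.default_rng(0)
query_locations = rng.uniform(0, 100, size=(50, 2))
query_descriptors = rng.normal(size=(50, 40)).astype('float32')
index_locations = rng.uniform(0, 100, size=(50, 2))
index_descriptors = rng.normal(size=(50, 40)).astype('float32')
query_im = rng.integers(0, 255, size=(100, 100, 3), dtype='uint8')
index_im = rng.integers(0, 255, size=(100, 100, 3), dtype='uint8')

score, match_viz_bytes = image_reranking.MatchFeatures(
    query_locations, query_descriptors, index_locations, index_descriptors,
    query_im_array=query_im, index_im_array=index_im)
if match_viz_bytes:  # Non-empty only when geometric verification succeeds.
  with open('/tmp/match_viz.jpg', 'wb') as f:
    f.write(match_viz_bytes)
```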
@@ -122,8 +122,11 @@ def main(unused_argv):
      continue

    # Extract and save features.
    extracted_features = extractor_fn(im)
    locations_out = extracted_features['local_features']['locations']
    descriptors_out = extracted_features['local_features']['descriptors']
    feature_scales_out = extracted_features['local_features']['scales']
    attention_out = extracted_features['local_features']['attention']

    feature_io.WriteToFile(out_desc_fullpath, locations_out,
                           feature_scales_out, descriptors_out,
In `delf/python/extractor.py`:

@@ -30,7 +30,7 @@
_MIN_HEIGHT = 10
_MIN_WIDTH = 10


def ResizeImage(image, config, resize_factor=1.0):
  """Resizes image according to config.

  Args:

@@ -39,9 +39,6 @@
    resize_factor: Optional float resize factor for the input image. If given,
      the maximum and minimum allowed image sizes in `config` are scaled by this
      factor. Must be non-negative.

  Returns:
    resized_image: Uint8 array with resized image.

@@ -72,7 +69,7 @@
    scale_factor = max_image_size / largest_side
  elif min_image_size >= 0 and largest_side < min_image_size:
    scale_factor = min_image_size / largest_side
  elif config.use_square_images and (height != width):
    scale_factor = 1.0
  else:
    # No resizing needed, early return.

@@ -80,7 +77,7 @@
  # Note that new_shape is in (width, height) format (PIL convention), while
  # scale_factors are in (height, width) convention (NumPy convention).
  if config.use_square_images:
    new_shape = (int(round(largest_side * scale_factor)),
                 int(round(largest_side * scale_factor)))
  else:

@@ -97,7 +94,7 @@
def MakeExtractor(sess, config, import_scope=None):
  """Creates a function to extract global and/or local features from an image.

  Args:
    sess: TensorFlow session to use.

@@ -107,62 +104,125 @@
  Returns:
    Function that receives an image and returns features.
  """
  # Load model.
  tf.compat.v1.saved_model.loader.load(
      sess, [tf.compat.v1.saved_model.tag_constants.SERVING],
      config.model_path,
      import_scope=import_scope)
  import_scope_prefix = import_scope + '/' if import_scope is not None else ''

  # Input tensors.
  input_image = sess.graph.get_tensor_by_name('%sinput_image:0' %
                                              import_scope_prefix)
  input_image_scales = sess.graph.get_tensor_by_name('%sinput_scales:0' %
                                                     import_scope_prefix)
  if config.use_local_features:
    input_score_threshold = sess.graph.get_tensor_by_name(
        '%sinput_abs_thres:0' % import_scope_prefix)
    input_max_feature_num = sess.graph.get_tensor_by_name(
        '%sinput_max_feature_num:0' % import_scope_prefix)

  # Output tensors.
  if config.use_global_features:
    raw_global_descriptors = sess.graph.get_tensor_by_name(
        '%sglobal_descriptors:0' % import_scope_prefix)
  if config.use_local_features:
    boxes = sess.graph.get_tensor_by_name('%sboxes:0' % import_scope_prefix)
    raw_local_descriptors = sess.graph.get_tensor_by_name('%sfeatures:0' %
                                                          import_scope_prefix)
    feature_scales = sess.graph.get_tensor_by_name('%sscales:0' %
                                                   import_scope_prefix)
    attention_with_extra_dim = sess.graph.get_tensor_by_name(
        '%sscores:0' % import_scope_prefix)

  # Post-process extracted features: normalize, PCA (optional), pooling.
  if config.use_global_features:
    if config.delf_global_config.image_scales_ind:
      raw_global_descriptors_selected_scales = tf.gather(
          raw_global_descriptors,
          list(config.delf_global_config.image_scales_ind))
    else:
      raw_global_descriptors_selected_scales = raw_global_descriptors
    global_descriptors_per_scale = feature_extractor.PostProcessDescriptors(
        raw_global_descriptors_selected_scales,
        config.delf_global_config.use_pca,
        config.delf_global_config.pca_parameters)
    unnormalized_global_descriptor = tf.reduce_sum(
        global_descriptors_per_scale, axis=0, name='sum_pooling')
    global_descriptor = tf.nn.l2_normalize(
        unnormalized_global_descriptor, axis=0, name='final_l2_normalization')
  if config.use_local_features:
    attention = tf.reshape(attention_with_extra_dim,
                           [tf.shape(attention_with_extra_dim)[0]])
    locations, local_descriptors = feature_extractor.DelfFeaturePostProcessing(
        boxes, raw_local_descriptors, config)

  def ExtractorFn(image, resize_factor=1.0):
    """Receives an image and returns DELF global and/or local features.

    If image is too small, returns empty features.

    Args:
      image: Uint8 array with shape (height, width, 3) containing the RGB
        image.
      resize_factor: Optional float resize factor for the input image. If
        given, the maximum and minimum allowed image sizes in the config are
        scaled by this factor.

    Returns:
      extracted_features: A dict containing the extracted global descriptors
        (key 'global_descriptor' mapping to a [D] float array), and/or local
        features (key 'local_features' mapping to a dict with keys
        'locations', 'descriptors', 'scales', 'attention').
    """
    resized_image, scale_factors = ResizeImage(
        image, config, resize_factor=resize_factor)

    # If the image is too small, returns empty features.
    if resized_image.shape[0] < _MIN_HEIGHT or resized_image.shape[
        1] < _MIN_WIDTH:
      extracted_features = {'global_descriptor': np.array([])}
      if config.use_local_features:
        extracted_features.update({
            'local_features': {
                'locations': np.array([]),
                'descriptors': np.array([]),
                'scales': np.array([]),
                'attention': np.array([]),
            }
        })
      return extracted_features

    feed_dict = {
        input_image: resized_image,
        input_image_scales: list(config.image_scales),
    }
    fetches = {}
    if config.use_global_features:
      fetches.update({
          'global_descriptor': global_descriptor,
      })
    if config.use_local_features:
      feed_dict.update({
          input_score_threshold: config.delf_local_config.score_threshold,
          input_max_feature_num: config.delf_local_config.max_feature_num,
      })
      fetches.update({
          'local_features': {
              'locations': locations,
              'descriptors': local_descriptors,
              'scales': feature_scales,
              'attention': attention,
          }
      })

    extracted_features = sess.run(fetches, feed_dict=feed_dict)

    # Adjust local feature positions due to rescaling.
    if config.use_local_features:
      extracted_features['local_features']['locations'] /= scale_factors

    return extracted_features

  return ExtractorFn
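A usage sketch of the refactored extractor API (hedged: the config path is illustrative and the model from the DELG instructions above must have been downloaded; this mirrors what `extract_features.py` does):

```python
import numpy as np
import tensorflow as tf
from google.protobuf import text_format
from delf import delf_config_pb2
from delf import extractor

config = delf_config_pb2.DelfConfig()
with tf.io.gfile.GFile('delg_gld_config.pbtxt', 'r') as f:
  text_format.Parse(f.read(), config)

with tf.Graph().as_default(), tf.compat.v1.Session() as sess:
  sess.run(tf.compat.v1.global_variables_initializer())
  extractor_fn = extractor.MakeExtractor(sess, config)
  image = np.zeros((480, 640, 3), dtype='uint8')  # Placeholder RGB image.
  features = extractor_fn(image)
  print(features['global_descriptor'].shape)
  print(features['local_features']['locations'].shape)
```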
In `delf/python/extractor_test.py`:

@@ -66,10 +66,12 @@ class ExtractorTest(tf.test.TestCase, parameterized.TestCase):
    # Set up config.
    config = delf_config_pb2.DelfConfig(
        max_image_size=max_image_size,
        min_image_size=min_image_size,
        use_square_images=square_output)

    resized_image, scale_factors = extractor.ResizeImage(
        image, config, resize_factor)
    self.assertAllEqual(resized_image.shape, expected_shape)
    self.assertAllClose(scale_factors, expected_scale_factors)

@@ -87,10 +89,12 @@ class ExtractorTest(tf.test.TestCase, parameterized.TestCase):
    # Set up config.
    config = delf_config_pb2.DelfConfig(
        max_image_size=max_image_size,
        min_image_size=min_image_size,
        use_square_images=square_output)

    resized_image, scale_factors = extractor.ResizeImage(
        image, config, resize_factor)
    self.assertAllEqual(resized_image.shape, expected_shape)
    self.assertAllClose(scale_factors, expected_scale_factors)