Unverified Commit 30bac445 authored by André Araujo's avatar André Araujo Committed by GitHub

TF2 version for global feature model exporting (#8760)

* Merged commit includes the following changes:
253126424  by Andre Araujo:

    Scripts to compute metrics for Google Landmarks dataset.

    Also, a small fix to metric in retrieval case: avoids duplicate predicted images.

--
253118971  by Andre Araujo:

    Metrics for Google Landmarks dataset.

--
253106953  by Andre Araujo:

    Library to read files from Google Landmarks challenges.

--
250700636  by Andre Araujo:

    Handle case of aggregation extraction with empty set of input features.

--
250516819  by Andre Araujo:

    Add minimum size for DELF extractor.

--
250435822  by Andre Araujo:

    Add max_image_size/min_image_size for open-source DELF proto / module.

--
250414606  by Andre Araujo:

    Refactor extract_aggregation to allow reuse with different datasets.

--
250356863  by Andre Araujo:

    Remove unnecessary cmd_args variable from boxes_and_features_extraction.

--
249783379  by Andre Araujo:

    Create directory for writing mapping file if it does not exist.

--
249581591  by Andre Araujo:

    Refactor scripts to extract boxes and features from images in Revisited datasets.
    Also, change tf.logging.info --> print for easier logging in open source code.

--
249511821  by Andre Araujo:

    Small change to function for file/directory handling.

--
249289499  by Andre Araujo:

    Internal change.

--

PiperOrigin-RevId: 253126424

* Updating DELF init to adjust to latest changes

* Editing init files for python packages

* Edit D2R dataset reader to work with py3.

PiperOrigin-RevId: 253135576

* DELF package: fix import ordering

* Adding new requirements to setup.py

* Adding init file for training dir

* Merged commit includes the following changes:

FolderOrigin-RevId: /google/src/cloud/andrearaujo/delf_oss/google3/..

* Adding init file for training subdirs

* Working version of DELF training

* Internal change.

PiperOrigin-RevId: 253248648

* Fix variance loading in open-source code.

PiperOrigin-RevId: 260619120

* Separate image re-ranking as a standalone library, and add metric writing to dataset library.

PiperOrigin-RevId: 260998608

* Tool to read written D2R Revisited datasets metrics file. Test is added.

Also adds a unit test for previously-existing SaveMetricsFile function.

PiperOrigin-RevId: 263361410

* Add optional resize factor for feature extraction.

PiperOrigin-RevId: 264437080

* Adapt to spacing changes in newer NumPy versions.

PiperOrigin-RevId: 265127245

* Make image matching function visible, and add support for RANSAC seed.

PiperOrigin-RevId: 277177468

* Avoid matplotlib failure due to missing display backend.

PiperOrigin-RevId: 287316435

* Removes tf.contrib dependency.

PiperOrigin-RevId: 288842237

* Fix tf contrib removal for feature_aggregation_extractor.

PiperOrigin-RevId: 289487669

* Merged commit includes the following changes:
309118395  by Andre Araujo:

    Make DELF open-source code compatible with TF2.

--
309067582  by Andre Araujo:

    Handle image resizing rounding properly for python extraction.

    New behavior is tested with unit tests.

--
308690144  by Andre Araujo:

    Several changes to improve DELF model/training code and make it work in TF 2.1.0:
    - Rename some files for better clarity
    - Using compat.v1 versions of functions
    - Formatting changes
    - Using more appropriate TF function names

--
308689397  by Andre Araujo:

    Internal change.

--
308341315  by Andre Araujo:

    Remove old slim dependency in DELF open-source model.

    This avoids issues with requiring old TF-v1, making it compatible with latest TF.

--
306777559  by Andre Araujo:

    Internal change

--
304505811  by Andre Araujo:

    Raise error during geometric verification if local features have different dimensionalities.

--
301739992  by Andre Araujo:

    Transform some geometric verification constants into arguments, to allow custom matching.

--
301300324  by Andre Araujo:

    Apply name change (experimental_run_v2 -> run) for all callers in TensorFlow.

--
299919057  by Andre Araujo:

    Automated refactoring to make code Python 3 compatible.

--
297953698  by Andre Araujo:

    Explicitly replace "import tensorflow" with "tensorflow.compat.v1" for TF2.x migration

--
297521242  by Andre Araujo:

    Explicitly replace "import tensorflow" with "tensorflow.compat.v1" for TF2.x migration

--
297278247  by Andre Araujo:

    Explicitly replace "import tensorflow" with "tensorflow.compat.v1" for TF2.x migration

--
297270405  by Andre Araujo:

    Explicitly replace "import tensorflow" with "tensorflow.compat.v1" for TF2.x migration

--
297238741  by Andre Araujo:

    Explicitly replace "import tensorflow" with "tensorflow.compat.v1" for TF2.x migration

--
297108605  by Andre Araujo:

    Explicitly replace "import tensorflow" with "tensorflow.compat.v1" for TF2.x migration

--
294676131  by Andre Araujo:

    Add option to resize images to square resolutions without aspect ratio preservation.

--
293849641  by Andre Araujo:

    Internal change.

--
293840896  by Andre Araujo:

    Changing Slim import to tf_slim codebase.

--
293661660  by Andre Araujo:

    Allow the delf training script to read from TFRecords dataset.

--
291755295  by Andre Araujo:

    Internal change.

--
291448508  by Andre Araujo:

    Internal change.

--
291414459  by Andre Araujo:

    Adding train script.

--
291384336  by Andre Araujo:

    Adding model export script and test.

--
291260565  by Andre Araujo:

    Adding placeholder for Google Landmarks dataset.

--
291205548  by Andre Araujo:

    Definition of DELF model using Keras ResNet50 as backbone.

--
289500793  by Andre Araujo:

    Add TFRecord building script for delf.

--

PiperOrigin-RevId: 309118395

* Updating README, dependency versions

* Updating training README

* Fixing init import of export_model

* Fixing init import of export_model_utils

* tkinter in INSTALL_INSTRUCTIONS

* Merged commit includes the following changes:

FolderOrigin-RevId: /google/src/cloud/andrearaujo/delf_oss/google3/..

* INSTALL_INSTRUCTIONS mentioning different cloning options

* Updating required TF version, since 2.1 is not available in pip

* Internal change.

PiperOrigin-RevId: 309136003

* Fix missing string_input_producer and start_queue_runners in TF2.

PiperOrigin-RevId: 309437512

* Handle RANSAC from skimage's latest versions.

PiperOrigin-RevId: 310170897

* DELF 2.1 version: badge and setup.py updated

* Add TF version badge in INSTALL_INSTRUCTIONS and paper badges in README

* Add paper badges in paper instructions

* Add paper badge to landmark detection instructions

* Small update to DELF training README

* Merged commit includes the following changes:
312614961  by Andre Araujo:

    Instructions/code to reproduce DELG paper results.

--
312523414  by Andre Araujo:

    Fix a minor bug when post-processing extracted features: format config.delf_global_config.image_scales_ind to a list.

--
312340276  by Andre Araujo:

    Add support for global feature extraction in DELF open-source codebase.

--
311031367  by Andre Araujo:

    Add use_square_images as an option in DELF config. The default value is false. If it is set, images are resized to a square resolution before feature extraction (e.g. the Starburst use case). Considered having two constructors for DescriptorToImageTemplate, but in the end decided to keep only one, which may be less confusing.

--
310658638  by Andre Araujo:

    Option for producing local feature-based image match visualization.

--

PiperOrigin-RevId: 312614961

* DELF README update / DELG instructions

* DELF README update

* DELG instructions update

* Merged commit includes the following changes:

PiperOrigin-RevId: 312695597

* Merged commit includes the following changes:
312754894  by Andre Araujo:

    Code edits / instructions to reproduce GLDv2 results.

--

PiperOrigin-RevId: 312754894

* Markdown updates after adding GLDv2 stuff

* Small updates to DELF README

* Clarify that library must be installed before reproducing results

* Merged commit includes the following changes:
319114828  by Andre Araujo:

    Upgrade global feature model exporting to TF2.

--

PiperOrigin-RevId: 319114828

* Properly merging README

* small edits to README

* small edits to README

* small edits to README

* global feature exporting in training README
parent 04c0409c
# DELF Training Instructions

This README documents the end-to-end process for training a landmark detection
and retrieval model using the DELF library on the
[Google Landmarks Dataset v2](https://github.com/cvdfoundation/google-landmark)
(GLDv2). This can be achieved following these steps:

1. Install the DELF Python library.
2. Download the raw images of the GLDv2 dataset.
3. Prepare the training data.
4. Run the training.

The next sections will cover each of these steps in greater detail.
## Prerequisites

Clone the [TensorFlow Model Garden](https://github.com/tensorflow/models)
repository and move into the `models/research/delf/delf/python/training` folder.

```
git clone https://github.com/tensorflow/models.git
cd models/research/delf/delf/python/training
```
## Install the DELF Library

The DELF Python library can be installed by running the
[`install_delf.sh`](./install_delf.sh) script using the command:

```
bash install_delf.sh
```

The script installs both the DELF library and its dependencies in the following
sequence:

* Install TensorFlow 2.2 and TensorFlow 2.2 for GPU.
* Install the [TF-Slim](https://github.com/google-research/tf-slim) library
  from source.
* Download [protoc](https://github.com/protocolbuffers/protobuf) and compile
  the DELF Protocol Buffers.
* Install the matplotlib, numpy, scikit-image, scipy and python3-tk Python
  libraries.
* Install the
  [TensorFlow Object Detection API](https://github.com/tensorflow/models/tree/master/research/object_detection)
  from the cloned TensorFlow Model Garden repository.
* Install the DELF package.

*Please note that the current installation only works on 64-bit Linux
architectures due to the `protoc` binary downloaded by the installation script.
If you wish to install the DELF library on other architectures, please update
the [`install_delf.sh`](./install_delf.sh) script by referencing the desired
`protoc`
[binary release](https://github.com/protocolbuffers/protobuf/releases).*
## Download the GLDv2 Training Data

The [GLDv2](https://github.com/cvdfoundation/google-landmark) images are grouped
in 3 datasets: TRAIN, INDEX, TEST. Images in each dataset are grouped into
`*.tar` files and individually referenced in `*.csv` files containing training
metadata and licensing information. The number of `*.tar` files per dataset is
as follows:

* TRAIN: 500 files.
* INDEX: 100 files.
* TEST: 20 files.

To download the GLDv2 images, run the
[`download_dataset.sh`](./download_dataset.sh) script as in the following
example:

```
bash download_dataset.sh 500 100 20
```

The script takes the following parameters, in order:

* The number of image files from the TRAIN dataset to download (maximum 500).
* The number of image files from the INDEX dataset to download (maximum 100).
* The number of image files from the TEST dataset to download (maximum 20).

The script downloads the GLDv2 images under the following directory structure:

* gldv2_dataset/
    * train/ - Contains raw images from the TRAIN dataset.
    * index/ - Contains raw images from the INDEX dataset.
    * test/ - Contains raw images from the TEST dataset.

Each of the three folders `gldv2_dataset/train/`, `gldv2_dataset/index/` and
`gldv2_dataset/test/` contains the following:

* The downloaded `*.tar` files.
* The corresponding MD5 checksum files, `*.txt`.
* The unpacked content of the downloaded files. (*Images are organized in
  folders and subfolders based on the first, second and third characters in
  their file names.*)
* The CSV files containing training and licensing metadata of the downloaded
  images.

*Please note that due to the large size of the GLDv2 dataset, the download can
take up to 12 hours and use up to 1 TB of disk space. To save bandwidth and disk
space, you may want to start by downloading only the TRAIN dataset, the only one
required for training, thus saving approximately 95 GB, the equivalent of the
INDEX and TEST datasets. To further save disk space, the `*.tar` files can be
deleted after downloading and unpacking them.*
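The three-character layout of the unpacked images can be sketched in a few lines of Python. `image_path` is a hypothetical helper written for illustration, not part of the DELF scripts; it assumes the GLDv2 convention described above, where an image lives under nested folders named after the first three characters of its file name:

```python
import os

def image_path(dataset_dir, image_id):
  """Map a GLDv2 image id to its unpacked location: nested folders named
  after the first, second and third characters of the file name."""
  return os.path.join(dataset_dir, image_id[0], image_id[1], image_id[2],
                      image_id + '.jpg')

# An image with id '0123456789abcdef' downloaded into the TRAIN dataset:
print(image_path('gldv2_dataset/train', '0123456789abcdef'))
# → gldv2_dataset/train/0/1/2/0123456789abcdef.jpg
```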
## Prepare the Data for Training

Preparing the data for training consists of creating
[TFRecord](https://www.tensorflow.org/tutorials/load_data/tfrecord) files from
the raw GLDv2 images, grouped into TRAIN and VALIDATION splits. The training set
produced contains only the *clean* subset of the GLDv2 dataset. The
[CVPR'20 paper](https://arxiv.org/abs/2004.01804) introducing the GLDv2 dataset
contains a detailed description of the *clean* subset.

Generating the TFRecord files containing the TRAIN and VALIDATION splits of the
*clean* GLDv2 subset can be achieved by running the
[`build_image_dataset.py`](./build_image_dataset.py) script. Assuming that the
GLDv2 images have been downloaded to the `gldv2_dataset` folder, the script can
be run as follows:

```
python3 build_image_dataset.py \
  --train_csv_path=gldv2_dataset/train/train.csv \
  --train_clean_csv_path=gldv2_dataset/train/train_clean.csv \
  --train_directory=gldv2_dataset/train/*/*/*/ \
  --output_directory=gldv2_dataset/tfrecord/ \
  --num_shards=128 \
  --generate_train_validation_splits \
  --validation_split_size=0.2
```

*Please refer to the source code of the
[`build_image_dataset.py`](./build_image_dataset.py) script for a detailed
description of its parameters.*

The TFRecord files written in the `OUTPUT_DIRECTORY` will be prefixed as
follows:

* TRAIN split: `train-*`
* VALIDATION split: `validation-*`

The same script can be used to generate TFRecord files for the TEST split for
post-training evaluation purposes. This can be achieved by adding the
parameters:

```
--test_csv_path=gldv2_dataset/train/test.csv \
--test_directory=gldv2_dataset/test/*/*/*/ \
```

In this scenario, the TFRecord files of the TEST split written in the
`OUTPUT_DIRECTORY` will be named according to the pattern `test-*`.

*Please note that due to the large size of the GLDv2 dataset, the generation of
the TFRecord files can take up to 12 hours and up to 500 GB of disk space.*
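With `--validation_split_size=0.2`, roughly 20% of the images are reserved for validation. The real logic lives in [`build_image_dataset.py`](./build_image_dataset.py); the following is only a minimal sketch, with hypothetical names, of what such a fractional split means:

```python
import random

def split_train_validation(image_ids, validation_split_size=0.2, seed=0):
  """Shuffle ids deterministically, then carve off a validation fraction.
  Illustrative sketch only; not the logic used by build_image_dataset.py."""
  ids = list(image_ids)
  random.Random(seed).shuffle(ids)
  num_validation = int(len(ids) * validation_split_size)
  return ids[num_validation:], ids[:num_validation]

train_ids, validation_ids = split_train_validation(range(1000))
print(len(train_ids), len(validation_ids))  # → 800 200
```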
## Running the Training

For the training to converge faster, it is possible to initialize the ResNet
backbone with the weights of a pretrained ImageNet model. The ImageNet
checkpoint is available at the following location:
[`http://storage.googleapis.com/delf/resnet50_imagenet_weights.tar.gz`](http://storage.googleapis.com/delf/resnet50_imagenet_weights.tar.gz).
To download and unpack it, run the following commands on a Linux box:

```
curl -Os http://storage.googleapis.com/delf/resnet50_imagenet_weights.tar.gz
tar -xzvf resnet50_imagenet_weights.tar.gz
```

Assuming the TFRecord files were generated in the `gldv2_dataset/tfrecord/`
directory, running the following command should start training a model and
output the results in the `gldv2_training` directory:

```
python3 train.py \
  --train_file_pattern=gldv2_dataset/tfrecord/train* \
  --validation_file_pattern=gldv2_dataset/tfrecord/validation* \
  --imagenet_checkpoint=resnet50_weights_tf_dim_ordering_tf_kernels_notop.h5 \
  --dataset_version=gld_v2_clean \
  --logdir=gldv2_training/
```

On a multi-GPU machine, the batch size can be increased to speed up the training
using the `--batch_size` parameter. On a machine with 8 Tesla P100 GPUs, you can
set the batch size to `256`:

```
--batch_size=256
```
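The suggested value of `256` works out to 32 images per GPU on 8 GPUs, assuming the global batch is divided evenly across replicas (as with `tf.distribute.MirroredStrategy`); the per-GPU figure is an illustrative assumption, not a documented default:

```python
def per_replica_batch_size(global_batch_size, num_replicas):
  """Global batch divided evenly across replicas; must divide exactly."""
  if global_batch_size % num_replicas != 0:
    raise ValueError('global batch size must be divisible by replica count')
  return global_batch_size // num_replicas

print(per_replica_batch_size(256, 8))  # → 32
```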
## Exporting the Trained Model

Assuming the training output, the TensorFlow checkpoint, is in the
`gldv2_training` directory, running the following commands exports the model.

### DELF local feature model

```
python3 model/export_model.py \
  --ckpt_path=gldv2_training/delf_weights \
  --export_path=gldv2_model_local \
  --block3_strides
```

### Kaggle-compatible global feature model

To export a global feature model in the format required by the
[2020 Landmark Retrieval challenge](https://www.kaggle.com/c/landmark-retrieval-2020),
you can use the following command:

```
python3 model/export_global_model.py \
  --ckpt_path=gldv2_training/delf_weights \
  --export_path=gldv2_model_global \
  --input_scales_list=0.70710677,1.0,1.4142135 \
  --multi_scale_pool_type=sum \
  --normalize_global_descriptor
```
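The values passed to `--input_scales_list` above form a geometric sequence in powers of √2, i.e. a half-octave image pyramid (the flag values are these constants rounded to float32 precision):

```python
# The three scales are 2**(-1/2), 2**0 and 2**(1/2), matching the flag
# values 0.70710677, 1.0 and 1.4142135 up to float32 rounding.
scales = [2.0 ** (e / 2.0) for e in (-1, 0, 1)]
print(scales)  # ≈ [0.7071, 1.0, 1.4142]
```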
## Testing the Trained Model

After the trained model has been exported, it can be used to extract DELF
features from 2 images of the same landmark and to perform a matching test
between the 2 images based on the extracted features, to validate that they
represent the same landmark.

Start by downloading the Oxford buildings dataset:

```
mkdir data && cd data
wget http://www.robots.ox.ac.uk/~vgg/data/oxbuildings/oxbuild_images.tgz
mkdir oxford5k_images oxford5k_features
tar -xvzf oxbuild_images.tgz -C oxford5k_images/
cd ../
echo data/oxford5k_images/hertford_000056.jpg >> list_images.txt
echo data/oxford5k_images/oxford_000317.jpg >> list_images.txt
```

Make a copy of the
[`delf_config_example.pbtxt`](../examples/delf_config_example.pbtxt) protobuffer
file which configures the DELF feature extraction. Update the file by making the
following changes:

* set the `model_path` attribute to the directory containing the exported
  model, `gldv2_model_local` in this example
* add at the root level the attribute `is_tf2_exported` with the value `true`
* set to `false` the `use_pca` attribute inside `delf_local_config`

The ensuing file should resemble the following:

```
model_path: "gldv2_model_local"
image_scales: .25
image_scales: .3536
image_scales: .5
...
delf_local_config {
  ...
}
```

Run the following command to extract DELF features for the images
`hertford_000056.jpg` and `oxford_000317.jpg`:

```
python3 ../examples/extract_features.py \
  --config_path delf_config_example.pbtxt \
  ...
  --output_dir data/oxford5k_features
```

Run the following command to perform feature matching between the images
`hertford_000056.jpg` and `oxford_000317.jpg`:

```
python3 ../examples/match_images.py \
  --image_1_path data/oxford5k_images/hertford_000056.jpg \
  ...
```
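`match_images.py` performs geometric verification (RANSAC) over putative correspondences between the two images' local features. As a toy, library-free illustration of the putative-matching step only, with entirely hypothetical descriptors and names, a nearest-neighbor match with a distance threshold looks like this:

```python
def squared_distance(a, b):
  return sum((x - y) ** 2 for x, y in zip(a, b))

def putative_matches(descriptors_1, descriptors_2, max_distance=0.8):
  """For each descriptor in image 1, find its nearest neighbor in image 2
  and keep the pair if it is close enough. Toy sketch, not the DELF API."""
  matches = []
  for i, d1 in enumerate(descriptors_1):
    j, dist = min(
        ((j, squared_distance(d1, d2)) for j, d2 in enumerate(descriptors_2)),
        key=lambda t: t[1])
    if dist <= max_distance ** 2:
      matches.append((i, j))
  return matches

# Two tiny 2-D "descriptors" per image; only the first pair is close enough:
print(putative_matches([(0.0, 0.0), (1.0, 1.0)], [(0.1, 0.0), (5.0, 5.0)]))
# → [(0, 0)]
```

In the real pipeline these putative matches are then filtered by fitting a geometric transform with RANSAC, so that only spatially consistent correspondences count toward the match decision.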
...@@ -52,94 +52,103 @@ flags.DEFINE_boolean('normalize_global_descriptor', False, ...@@ -52,94 +52,103 @@ flags.DEFINE_boolean('normalize_global_descriptor', False,
'If True, L2-normalizes global descriptor.') 'If True, L2-normalizes global descriptor.')
def _build_tensor_info(tensor_dict): class _ExtractModule(tf.Module):
"""Replace the dict's value by the tensor info. """Helper module to build and save global feature model."""
Args: def __init__(self,
tensor_dict: A dictionary contains <string, tensor>. multi_scale_pool_type='None',
normalize_global_descriptor=False,
Returns: input_scales_tensor=None):
dict: New dictionary contains <string, tensor_info>. """Initialization of global feature model.
"""
return { Args:
k: tf.compat.v1.saved_model.utils.build_tensor_info(t) multi_scale_pool_type: Type of multi-scale pooling to perform.
for k, t in tensor_dict.items() normalize_global_descriptor: Whether to L2-normalize global descriptor.
} input_scales_tensor: If None, the exported function to be used should be
ExtractFeatures, where an input end-point "input_scales" is added for
the exported model. If not None, the specified 1D tensor of floats will
def main(argv): be hard-coded as the desired input scales, in conjunction with
if len(argv) > 1: ExtractFeaturesFixedScales.
raise app.UsageError('Too many command-line arguments.') """
self._multi_scale_pool_type = multi_scale_pool_type
export_path = FLAGS.export_path self._normalize_global_descriptor = normalize_global_descriptor
if os.path.exists(export_path): if input_scales_tensor is None:
raise ValueError('Export_path already exists.') self._input_scales_tensor = []
else:
with tf.Graph().as_default() as g, tf.compat.v1.Session(graph=g) as sess: self._input_scales_tensor = input_scales_tensor
# Setup the model for extraction. # Setup the DELF model for extraction.
model = delf_model.Delf(block3_strides=False, name='DELF') self._model = delf_model.Delf(block3_strides=False, name='DELF')
# Initial forward pass to build model. def LoadWeights(self, checkpoint_path):
images = tf.zeros((1, 321, 321, 3), dtype=tf.float32) self._model.load_weights(checkpoint_path)
model(images)
# Setup the multiscale extraction.
input_image = tf.compat.v1.placeholder(
tf.uint8, shape=(None, None, 3), name='input_image')
if FLAGS.input_scales_list is None:
input_scales = tf.compat.v1.placeholder(
tf.float32, shape=[None], name='input_scales')
else:
input_scales = tf.constant([float(s) for s in FLAGS.input_scales_list],
dtype=tf.float32,
shape=[len(FLAGS.input_scales_list)],
name='input_scales')
@tf.function(input_signature=[
tf.TensorSpec(shape=[None, None, 3], dtype=tf.uint8, name='input_image'),
tf.TensorSpec(shape=[None], dtype=tf.float32, name='input_scales'),
tf.TensorSpec(
shape=[None], dtype=tf.int32, name='input_global_scales_ind')
])
def ExtractFeatures(self, input_image, input_scales, input_global_scales_ind):
extracted_features = export_model_utils.ExtractGlobalFeatures( extracted_features = export_model_utils.ExtractGlobalFeatures(
input_image, input_image,
input_scales, input_scales,
lambda x: model.backbone(x, training=False), input_global_scales_ind,
multi_scale_pool_type=FLAGS.multi_scale_pool_type, lambda x: self._model.backbone.build_call(x, training=False),
normalize_global_descriptor=FLAGS.normalize_global_descriptor) multi_scale_pool_type=self._multi_scale_pool_type,
normalize_global_descriptor=self._normalize_global_descriptor)
    named_output_tensors = {}
    if self._multi_scale_pool_type == 'None':
      named_output_tensors['global_descriptors'] = tf.identity(
          extracted_features, name='global_descriptors')
    else:
      named_output_tensors['global_descriptor'] = tf.identity(
          extracted_features, name='global_descriptor')
    return named_output_tensors

  @tf.function(input_signature=[
      tf.TensorSpec(shape=[None, None, 3], dtype=tf.uint8, name='input_image')
  ])
  def ExtractFeaturesFixedScales(self, input_image):
    return self.ExtractFeatures(input_image, self._input_scales_tensor,
                                tf.range(tf.size(self._input_scales_tensor)))


def main(argv):
  if len(argv) > 1:
    raise app.UsageError('Too many command-line arguments.')

  export_path = FLAGS.export_path
  if os.path.exists(export_path):
    raise ValueError('export_path %s already exists.' % export_path)

  if FLAGS.input_scales_list is None:
    input_scales_tensor = None
  else:
    input_scales_tensor = tf.constant(
        [float(s) for s in FLAGS.input_scales_list],
        dtype=tf.float32,
        shape=[len(FLAGS.input_scales_list)],
        name='input_scales')
  module = _ExtractModule(FLAGS.multi_scale_pool_type,
                          FLAGS.normalize_global_descriptor,
                          input_scales_tensor)

  # Load the weights.
  checkpoint_path = FLAGS.ckpt_path
  module.LoadWeights(checkpoint_path)
  print('Checkpoint loaded from ', checkpoint_path)

  # Save the module.
  if FLAGS.input_scales_list is None:
    served_function = module.ExtractFeatures
  else:
    served_function = module.ExtractFeaturesFixedScales

  tf.saved_model.save(
      module, export_path, signatures={'serving_default': served_function})

if __name__ == '__main__':
  app.run(main)
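The export above hinges on the TF2 `tf.Module` + `signatures` pattern: a `tf.function` with a fixed `input_signature` is saved as the `serving_default` signature and can be called by name after loading. The following is a minimal analogue of that pattern with a toy module (not the DELF model; `_DoubleModule` and its `Double` method are made-up names for illustration):

```python
import tempfile
import tensorflow as tf


class _DoubleModule(tf.Module):
  """Toy stand-in for _ExtractModule: one tf.function with a fixed signature."""

  @tf.function(input_signature=[tf.TensorSpec([None], tf.float32, name='x')])
  def Double(self, x):
    # Returning a dict makes the signature's outputs keyed by these names.
    return {'y': 2.0 * x}


module = _DoubleModule()
export_dir = tempfile.mkdtemp()
# Same pattern as above: the chosen tf.function becomes 'serving_default'.
tf.saved_model.save(module, export_dir,
                    signatures={'serving_default': module.Double})

loaded = tf.saved_model.load(export_dir)
result = loaded.signatures['serving_default'](x=tf.constant([1.0, 2.0]))
print(result['y'].numpy())  # [2. 4.]
```

This mirrors how `main` picks either `ExtractFeatures` or `ExtractFeaturesFixedScales` as the served function, depending on whether the scales were baked in at export time.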
@@ -172,8 +172,10 @@ def ExtractLocalFeatures(image, image_scales, max_feature_num, abs_thres, iou,
          final_boxes.get_field('scores'), 1)
@tf.function
def ExtractGlobalFeatures(image,
                          image_scales,
                          global_scales_ind,
                          model_fn,
                          multi_scale_pool_type='None',
                          normalize_global_descriptor=False):
@@ -183,6 +185,8 @@ def ExtractGlobalFeatures(image,
    image: image tensor of type tf.uint8 with shape [h, w, channels].
    image_scales: 1D float tensor which contains float scales used for image
      pyramid construction.
    global_scales_ind: Feature extraction happens only for a subset of
      `image_scales`, those with corresponding indices from this tensor.
    model_fn: model function. Follows the signature:
      * Args:
        * `images`: Image tensor which is re-scaled.
@@ -204,59 +208,45 @@ def ExtractGlobalFeatures(image,
  """
  original_image_shape_float = tf.gather(
      tf.dtypes.cast(tf.shape(image), tf.float32), [0, 1])

  image_tensor = gld.NormalizeImages(
      image, pixel_value_offset=128.0, pixel_value_scale=128.0)
  image_tensor = tf.expand_dims(image_tensor, 0, name='image/expand_dims')
  def _ResizeAndExtract(scale_index):
    """Helper function to resize image then extract global feature.

    Args:
      scale_index: A valid index in image_scales.

    Returns:
      global_descriptor: [1,D] tensor denoting the extracted global descriptor.
    """
    scale = tf.gather(image_scales, scale_index)
    new_image_size = tf.dtypes.cast(
        tf.round(original_image_shape_float * scale), tf.int32)
    resized_image = tf.image.resize(image_tensor, new_image_size)
    global_descriptor = model_fn(resized_image)
    return global_descriptor

  # First loop to find initial scale to be used.
  num_scales = tf.shape(image_scales)[0]
  initial_scale_index = tf.constant(-1, dtype=tf.int32)
  for scale_index in tf.range(num_scales):
    if tf.reduce_any(tf.equal(global_scales_ind, scale_index)):
      initial_scale_index = scale_index
      break

  output_global = _ResizeAndExtract(initial_scale_index)

  # Loop over subsequent scales.
  for scale_index in tf.range(initial_scale_index + 1, num_scales):
    # Allow an undefined number of global feature scales to be extracted.
    tf.autograph.experimental.set_loop_options(
        shape_invariants=[(output_global, tf.TensorShape([None, None]))])

    if tf.reduce_any(tf.equal(global_scales_ind, scale_index)):
      global_descriptor = _ResizeAndExtract(scale_index)
      output_global = tf.concat([output_global, global_descriptor], 0)

  normalization_axis = 1
  if multi_scale_pool_type == 'average':
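The control flow above — walk the image pyramid, extract a `[1, D]` descriptor only at the scales whose indices appear in `global_scales_ind`, and stack the results into `[S_selected, D]` — can be sketched outside TensorFlow with NumPy. The helper and dummy model below are illustrative stand-ins, not DELF code:

```python
import numpy as np


def extract_global(image_hw, image_scales, global_scales_ind, model_fn):
  """Mimics ExtractGlobalFeatures' control flow: resize at each selected
  scale, run model_fn, and stack the per-scale [1, D] descriptors."""
  descriptors = []
  for i, scale in enumerate(image_scales):
    # Pyramid levels whose index is not selected contribute no descriptor.
    if i not in global_scales_ind:
      continue
    new_size = np.round(np.asarray(image_hw, dtype=np.float32) * scale)
    descriptors.append(model_fn(new_size))
  return np.concatenate(descriptors, axis=0)  # shape [S_selected, D]


# Dummy "model": its [1, 2] descriptor is just the resized (h, w).
fake_model = lambda size_hw: size_hw.reshape(1, 2)
out = extract_global((100, 200), [0.5, 1.0, 2.0], {0, 2}, fake_model)
print(out.shape)  # (2, 2): only scales 0.5 and 2.0 were selected
```

In the real graph-mode code this simple loop needs `set_loop_options` with a `[None, None]` shape invariant, since the number of stacked rows grows across iterations.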