Unverified commit 04c0409c authored by Dan Anghel, committed by GitHub

Update to the DELF training README (#8759)



* First version of working script to download the GLDv2 dataset

* First version of the DELF package installation script

* First working version of the DELF package installation script

* Fixed feedback from PR review

* Pushed to GitHub the changes to the TFRecord data generation script for DELF.

* Merged commit includes the following changes:
315363544  by Andre Araujo:

    Added the generation of TRAIN and VALIDATE splits from the train dataset.

--
314676530  by Andre Araujo:

    Updated script to download GLDv2 images for DELF training.

--
314101235  by Andre Araujo:

    Added newly created module 'utils' to the copybara script.

--
313677085  by Andre Araujo:

    Code migration from TF1 to TF2 for:
    - logging (replaced usage of tf.compat.v1.logging.info)
    - testing directories (replaced usage of tf.compat.v1.test.get_temp_dir())
    - feature/object extraction scripts (replaced usage of tf.compat.v1.train.string_input_producer and tf.compat.v1.train.start_queue_runners with PIL)

--
312770828  by Andre Araujo:

    Internal change.

--

PiperOrigin-RevId: 315363544

* First version of the updated README with the DELF training instructions

* Added to the README the section describing the generation of the training data

* Added warning about the TFRecord generation time

* Updated the launch of the training

* Minor README update

* Integrated review feedback

* Merged commit includes the following changes:
315971979  by Andre Araujo:

    Performance optimization in generating the TRAIN and VALIDATION splits per label.

--
315578370  by Andre Araujo:

    Tiny fix to char limit in extractor.py.

--
315546242  by Andre Araujo:

    Script to measure DELG latency.

--
315545801  by Andre Araujo:

    Pre-load PCA parameters, if using them when extracting DELF/G features.

--
315450392  by Andre Araujo:

    Code migration from TF1 to TF2 for:
    - loading the models in extractor.py and detector.py using tf.saved_model.load
    - removed tf.compat.v1.Session from the extractor and detector model usage

--
315406342  by Andre Araujo:

    Internal change.

--

PiperOrigin-RevId: 315971979

* Merged commit includes the following changes:
316538447  by Andre Araujo:

    Read the number of classes from the GLDv2 dataset metadata.

--
316416973  by Andre Araujo:

    Migration of DELF code to TF2:
    - replaced tf.compat.v1.test.get_temp_dir() with FLAGS.test_tmpdir
    - removed delf_v1.py and its dependencies
    - removed tf.compat.v1, Session, Graph dependencies from feature_extractor.py, feature_aggregation_extractor.py and aggregation_extraction.py

--

PiperOrigin-RevId: 316538447

* Removed reference to delf_v1

* Merged commit includes the following changes:
318168500  by Andre Araujo:

    Several small changes to DELF open-source training code:
    - Replace "make_dataset_iterator" call which was deprecated by a more recent suitable version.
    - Add image summary, allowing visualization of the augmented images during training
    - Normalize images before feeding them to the model

--
316888714  by Andre Araujo:

    - Removed unnecessary cast from feature_aggregation_extraction.py
    - Fixed clustering script

--

PiperOrigin-RevId: 318168500

* Merged commit includes the following changes:
318401984  by Andre Araujo:

    Add attention visualization to DELF training script.

--

PiperOrigin-RevId: 318401984

* README update with training and model validation steps

* Minor fixes to the DELF training README

* Integrated review feedback

* Changed passing of boolean parameter to script
Co-authored-by: Andre Araujo <andrearaujo@google.com>
parent 4b46ab20
*… files can take up to 12 hours and up to 500 GB of disk space.*
## Running the Training
For the training to converge faster, it is possible to initialize the ResNet backbone with the
weights of a pretrained ImageNet model. The ImageNet checkpoint is available at the following
location: [`http://storage.googleapis.com/delf/resnet50_imagenet_weights.tar.gz`](http://storage.googleapis.com/delf/resnet50_imagenet_weights.tar.gz).
To download and unpack it, run the following commands on a Linux box:
```
curl -Os http://storage.googleapis.com/delf/resnet50_imagenet_weights.tar.gz
tar -xzvf resnet50_imagenet_weights.tar.gz
```
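The training command below assumes the archive unpacks to a Keras weights file named `resnet50_weights_tf_dim_ordering_tf_kernels_notop.h5`. A minimal Python sketch to list the archive contents and confirm the file name before training:
```
import tarfile

# List the members of the downloaded checkpoint archive without extracting it.
with tarfile.open("resnet50_imagenet_weights.tar.gz", "r:gz") as archive:
    for member in archive.getnames():
        print(member)
```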
Assuming the TFRecord files were generated in the `gldv2_dataset/tfrecord/` directory, running
the following command should start training a model and output the results in the `gldv2_training`
directory:
```
python3 train.py \
--train_file_pattern=gldv2_dataset/tfrecord/train* \
--validation_file_pattern=gldv2_dataset/tfrecord/validation* \
--imagenet_checkpoint=resnet50_weights_tf_dim_ordering_tf_kernels_notop.h5 \
--dataset_version=gld_v2_clean \
--logdir=gldv2_training/
```
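Since the training script writes summaries to the log directory (including the image and attention visualizations mentioned in the change log above), progress can be monitored with TensorBoard. A sketch using TensorBoard's programmatic API, equivalent to running `tensorboard --logdir=gldv2_training/` from the command line:
```
from tensorboard import program

# Start a TensorBoard server pointed at the training log directory.
tb = program.TensorBoard()
tb.configure(argv=[None, "--logdir", "gldv2_training/"])
url = tb.launch()
print("TensorBoard listening on " + url)
```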
On a multi-GPU machine, the batch size can be increased to speed up the training using the `--batch_size` parameter. On a machine with 8 Tesla P100 GPUs, you can set the batch size to `256`:
```
--batch_size=256
```
## Exporting the Trained Model
Assuming the training output (the TensorFlow checkpoint) is in the `gldv2_training` directory,
running the following command exports the model to the `gldv2_model` directory:
```
python3 model/export_model.py \
--ckpt_path=gldv2_training/delf_weights \
--export_path=gldv2_model \
--block3_strides
```
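As a quick sanity check, the exported SavedModel can be loaded back with `tf.saved_model.load` (the same TF2 loading path the change log above describes for `extractor.py` and `detector.py`); the exact signature names depend on the export script:
```
import tensorflow as tf

# Load the exported model and inspect its serving signatures.
model = tf.saved_model.load("gldv2_model")
print(list(model.signatures.keys()))
```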
## Testing the Trained Model
After the trained model has been exported, it can be used to extract DELF features from two images
of the same landmark, and to perform a matching test between the two images based on the extracted
features, validating that they represent the same landmark.
Start by downloading the Oxford buildings dataset:
```
mkdir data && cd data
wget http://www.robots.ox.ac.uk/~vgg/data/oxbuildings/oxbuild_images.tgz
mkdir oxford5k_images oxford5k_features
tar -xvzf oxbuild_images.tgz -C oxford5k_images/
cd ../
echo data/oxford5k_images/hertford_000056.jpg >> list_images.txt
echo data/oxford5k_images/oxford_000317.jpg >> list_images.txt
```
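Before extracting features, it can be worth verifying that both listed images load correctly; a small sketch using PIL (which the TF2 extraction code also uses for image reading, per the change log above):
```
from PIL import Image

# Open each image listed in list_images.txt and print its dimensions.
with open("list_images.txt") as image_list:
    for line in image_list:
        path = line.strip()
        with Image.open(path) as image:
            print(path, image.size)
```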
Make a copy of the [`delf_config_example.pbtxt`](../examples/delf_config_example.pbtxt)
protocol buffer text file, which configures the DELF feature extraction. Update the copy by making the
following changes:
* set the `model_path` attribute to the directory containing the exported model, `gldv2_model`
in this example
* add at the root level the attribute `is_tf2_exported` with the value `true`
* set the `use_pca` attribute inside `delf_local_config` to `false`
The resulting file should resemble the following:
```
model_path: "gldv2_model"
image_scales: .25
image_scales: .3536
image_scales: .5
image_scales: .7071
image_scales: 1.0
image_scales: 1.4142
image_scales: 2.0
is_tf2_exported: true
delf_local_config {
  use_pca: false
  max_feature_num: 1000
  score_threshold: 100.0
}
```
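For reference, the seven `image_scales` values above form a geometric sequence with ratio √2, covering scales from 1/4 to 2; a one-liner that reproduces them:
```
# Image scales are powers of sqrt(2): 2^(-2), 2^(-3/2), ..., 2^1.
scales = [2 ** (i / 2) for i in range(-4, 3)]
print([round(s, 4) for s in scales])  # [0.25, 0.3536, 0.5, 0.7071, 1.0, 1.4142, 2.0]
```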
Run the following command to extract DELF features for the images `hertford_000056.jpg` and
`oxford_000317.jpg`:
```
python3 ../examples/extract_features.py \
--config_path delf_config_example.pbtxt \
--list_images_path list_images.txt \
--output_dir data/oxford5k_features
```
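Each listed image produces one `.delf` file in `data/oxford5k_features`. To inspect the output, a sketch assuming the `feature_io` module shipped with the DELF package and its `ReadFromFile` helper (which, as assumed here, returns locations, scales, descriptors, attention scores, and orientations):
```
from delf import feature_io

# Read back the features extracted for one of the two images.
locations, scales, descriptors, attention, orientations = feature_io.ReadFromFile(
    "data/oxford5k_features/hertford_000056.delf")
print("keypoints:", locations.shape[0], "descriptor dim:", descriptors.shape[1])
```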
Run the following command to perform feature matching between the images `hertford_000056.jpg`
and `oxford_000317.jpg`:
```
python3 ../examples/match_images.py \
--image_1_path data/oxford5k_images/hertford_000056.jpg \
--image_2_path data/oxford5k_images/oxford_000317.jpg \
--features_1_path data/oxford5k_features/hertford_000056.delf \
--features_2_path data/oxford5k_features/oxford_000317.delf \
--output_image matched_images.png
```
The generated image `matched_images.png` should look similar to this one:
![MatchedImagesDemo](./matched_images_demo.png)