Unverified Commit 3a18c6ab authored by Dan Anghel, committed by GitHub

Added hyperparameter guidelines for DELF training (#9219)

* Merged commit includes the following changes:
326369548  by Andre Araujo:

    Fix import issues.

--
326159826  by Andre Araujo:

    Changed the implementation of the cosine weights from Keras layer to tf.Variable to manually control for L2 normalization.

--
326139082  by Andre Araujo:

    Support local feature matching using ratio test.

    To allow for easily choosing which matching type to use, we rename a flag/argument and modify all related files to avoid breakages.

    Also include a small change when computing nearest neighbors for geometric matching, to parallelize computation, which saves a little bit of time during execution (argument "n_jobs=-1").

--
326119848  by Andre Araujo:

    Option to measure DELG latency taking binarization into account.

--
324316608  by Andre Araujo:

    DELG global features training.

--
323693131  by Andre Araujo:

    PY3 conversion for delf public lib.

--
32104...
parent 23e26542
...@@ -143,6 +143,8 @@
curl -Os http://storage.googleapis.com/delf/resnet50_imagenet_weights.tar.gz
tar -xzvf resnet50_imagenet_weights.tar.gz
```
### Training with Local Features
Assuming the TFRecord files were generated in the `gldv2_dataset/tfrecord/`
directory, running the following command should start training a model and
output the results in the `gldv2_training` directory:
...@@ -156,13 +158,7 @@
python3 train.py \
  --logdir=gldv2_training/
```
### Training with Local and Global Features
It is also possible to train the model with an improved global features head as
introduced in the [DELG paper](https://arxiv.org/abs/2001.05027). To do this,
...@@ -179,6 +175,15 @@
python3 train.py \
  --delg_global_features
```
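For reference, a sketch combining this flag with the multi-GPU settings from the
Hyperparameter Guidelines below; the dataset-related flags elided from the
command above are omitted here and would still be required:

```shell
# Sketch only: assumes the elided dataset flags are also supplied.
python3 train.py \
  --logdir=gldv2_training/ \
  --delg_global_features \
  --batch_size=256 \
  --initial_lr=0.01
```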
### Hyperparameter Guidelines
To improve training convergence, the following hyperparameter values have been
tested and validated on the following infrastructures, with the remaining
`train.py` flags kept at their **default values**:
* 8 Tesla P100 GPUs: `--batch_size=256`, `--initial_lr=0.01`
* 4 Tesla P100 GPUs: `--batch_size=128`, `--initial_lr=0.005`
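The two validated settings above (256 → 0.01, 128 → 0.005) are consistent with
scaling the learning rate linearly with batch size. A small sketch for deriving
an `--initial_lr` value for other batch sizes, assuming that linear scaling
holds for this setup (`scaled_initial_lr` is a hypothetical helper, not part of
`train.py`):

```python
def scaled_initial_lr(batch_size, base_batch_size=256, base_lr=0.01):
    """Return an --initial_lr value scaled linearly with batch size.

    Assumption: learning rate proportional to batch size, anchored at
    the validated setting of batch size 256 with initial LR 0.01.
    """
    return base_lr * batch_size / base_batch_size

print(scaled_initial_lr(128))  # matches the validated 4-GPU setting, 0.005
```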
*NOTE*: We are currently working on adding the autoencoder described in the DELG
paper to this codebase; it is not yet implemented here. Stay tuned!