Explicit broadcast in image normalization for better performance (#6551)

With a trivial model, this improves data input pipeline throughput from 12.5K to 15K on a DGX1 V100 machine.

Commit 6f068c71 authored Apr 09, 2019 by Haoyu Zhang Committed by Toby Boyd Apr 09, 2019

Showing with 3 additions and 1 deletion

official/resnet/imagenet_preprocessing.py +3 -1

--- a/official/resnet/imagenet_preprocessing.py
+++ b/official/resnet/imagenet_preprocessing.py
@@ -148,7 +148,9 @@ def _mean_image_subtraction(image, means, num_channels):
    raise ValueError('len(means) must match the number of channels')
  # We have a 1-D tensor of means; convert to 3-D.
-  means = tf.expand_dims(tf.expand_dims(means, 0), 0)
+  # Note(b/130245863): we explicitly call `broadcast` instead of simply
+  # expanding dimensions for better performance.
+  means = tf.broadcast_to(means, tf.shape(image))
  return image - means
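
The change above only alters how `means` reaches the full image shape before the subtraction; the result should be the same. As a rough illustration, here is a minimal sketch using NumPy as a stand-in for TensorFlow (an assumption made for portability: `np.expand_dims` and `np.broadcast_to` mirror the `tf.expand_dims` and `tf.broadcast_to` calls in the diff). The function names and sample values are hypothetical, and the throughput gain reported in the commit message is specific to the TensorFlow input pipeline, so this sketch says nothing about performance.

```python
import numpy as np


def mean_subtract_implicit(image, means):
    # Old approach (analogue): reshape means from [C] to [1, 1, C] and let
    # the subtraction broadcast it implicitly.
    means = np.expand_dims(np.expand_dims(means, 0), 0)
    return image - means


def mean_subtract_explicit(image, means):
    # New approach (analogue): broadcast means to the image's full
    # [H, W, C] shape up front, then subtract same-shaped arrays.
    means = np.broadcast_to(means, image.shape)
    return image - means


# Hypothetical 2x2 image with 3 channels and made-up per-channel means.
image = np.arange(2 * 2 * 3, dtype=np.float32).reshape(2, 2, 3)
means = np.array([1.0, 2.0, 3.0], dtype=np.float32)

implicit = mean_subtract_implicit(image, means)
explicit = mean_subtract_explicit(image, means)

# Both variants should produce identical values and shapes.
print(np.array_equal(implicit, explicit))
print(explicit.shape)
```

Both paths depend on the last dimension of `image` matching `len(means)`, which is what the `ValueError` check in `_mean_image_subtraction` guards.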