*Before proceeding* please read the [Convolutional Neural Networks](https://www.tensorflow.org/tutorials/deep_cnn/index.html) tutorial; in
particular, focus on [Training a Model Using Multiple GPU Cards](https://www.tensorflow.org/tutorials/deep_cnn/index.html#launching_and_training_the_model_on_multiple_gpu_cards). The model training method is nearly identical to that described in the
CIFAR-10 multi-GPU model training. Briefly, the model training (see the code sketch after this list):
* Places an individual model replica on each GPU.
* Splits the batch across the GPUs.
* Updates model parameters synchronously by waiting for all GPUs to finish
processing a batch of data.
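
As a rough illustration of this pattern (and not the Inception training code itself), the following TensorFlow 1.x-style sketch splits a batch across two GPUs, builds one model replica ("tower") per GPU with shared variables, and applies a single synchronous update from the averaged gradients. The two-GPU count, the batch size, and the one-layer `tower_loss` model are assumptions made only for this example.

```python
import tensorflow as tf  # TensorFlow 1.x API, as in the tutorials referenced above

NUM_GPUS = 2     # assumed number of GPUs for this sketch
BATCH_SIZE = 64  # assumed global batch size; each GPU sees BATCH_SIZE // NUM_GPUS examples

def tower_loss(images, labels):
    """Hypothetical stand-in model: one linear layer plus softmax cross-entropy."""
    weights = tf.get_variable('weights', [784, 10],
                              initializer=tf.truncated_normal_initializer(stddev=0.01))
    biases = tf.get_variable('biases', [10],
                             initializer=tf.constant_initializer(0.0))
    logits = tf.matmul(images, weights) + biases
    return tf.reduce_mean(
        tf.nn.sparse_softmax_cross_entropy_with_logits(labels=labels, logits=logits))

images = tf.placeholder(tf.float32, [BATCH_SIZE, 784])
labels = tf.placeholder(tf.int64, [BATCH_SIZE])

# Split the input batch across the GPUs.
image_splits = tf.split(images, NUM_GPUS)
label_splits = tf.split(labels, NUM_GPUS)

optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.1)
tower_grads = []

# Place an individual model replica ("tower") on each GPU, reusing the same variables.
for i in range(NUM_GPUS):
    with tf.device('/gpu:%d' % i):
        with tf.variable_scope('model', reuse=(i > 0)):
            loss = tower_loss(image_splits[i], label_splits[i])
            tower_grads.append(optimizer.compute_gradients(loss))

# Synchronous update: wait for every tower's gradients, average them,
# then apply a single update to the shared parameters.
averaged_grads = []
for grads_and_vars in zip(*tower_grads):
    grads = [g for g, _ in grads_and_vars]
    var = grads_and_vars[0][1]
    averaged_grads.append((tf.add_n(grads) / NUM_GPUS, var))
train_op = optimizer.apply_gradients(averaged_grads)
```

Real multi-GPU training code typically also pins the shared variables to the CPU and feeds the towers from an input queue; this sketch omits those details to keep the synchronization structure visible.
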
...
...
We term each machine that maintains model parameters a `ps`, short for
*parameter server*. Multiple machines may comprise a `ps` as the model
parameters may be sharded across multiple machines.
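
As a rough illustration (not the configuration used by this model), the snippet below shows how a TensorFlow 1.x cluster might declare `ps` and `worker` jobs and shard variables across the parameter servers. The host addresses, task counts, and variable shapes are hypothetical.

```python
import tensorflow as tf  # TensorFlow 1.x API

# Hypothetical cluster: two parameter-server tasks and two worker tasks
# (localhost ports here for illustration; in practice these are separate machines).
cluster = tf.train.ClusterSpec({
    'ps': ['localhost:2222', 'localhost:2223'],
    'worker': ['localhost:2224', 'localhost:2225'],
})

# Each process runs a server for its own job and task (here: worker 0).
server = tf.train.Server(cluster, job_name='worker', task_index=0)

# replica_device_setter keeps ops on this worker while placing variables
# round-robin across the `ps` tasks, i.e. sharding the model parameters.
with tf.device(tf.train.replica_device_setter(
        cluster=cluster, worker_device='/job:worker/task:0')):
    weights = tf.get_variable('weights', [784, 10])
    biases = tf.get_variable('biases', [10])
```
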
Variables may be updated with synchronous or asynchronous gradient updates. One
may construct an [`Optimizer`](https://www.tensorflow.org/api_docs/python/train.html#optimizers) in TensorFlow
that constructs the necessary graph for either case diagrammed below from