    Multi-worker support for Resnet. (#6206) · f2e90945
    Ayush Dubey authored
    * Update the official ResNet model for multi-worker training with distribution strategies.
    
    * Fixes for multi-worker training.
    
    * Fix call to `get_distribution_strategy`.
    
    * Undo test change.
    
    * Fix spacing.
    
    * Move cluster configuration to distribution_utils.
    
    * Move `train_and_evaluate` out of the loop. Also update the docstrings for the multi-worker flags and add the `use_train_and_evaluate` flag.
    
    * Update distribution_strategy flag to match exported name for collective strategy.
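
For context on the cluster-configuration item above: multi-worker TensorFlow jobs conventionally describe the cluster through the `TF_CONFIG` environment variable, a JSON object with a `cluster` and a `task` entry. The sketch below shows one way such a helper could look. It is not the code from this commit; the function names `build_tf_config` and `configure_cluster_env` and the comma-separated host-string argument are hypothetical, chosen only for illustration.

```python
import json
import os


def build_tf_config(worker_hosts, task_index):
    """Build a TF_CONFIG-style dict for a worker-only cluster.

    worker_hosts: comma-separated "host:port" string (assumed input format).
    task_index: index of this process within the worker list.
    """
    workers = [h.strip() for h in worker_hosts.split(",") if h.strip()]
    if not workers:
        raise ValueError("worker_hosts must name at least one worker")
    if not 0 <= task_index < len(workers):
        raise ValueError("task_index must be in [0, %d)" % len(workers))
    return {
        "cluster": {"worker": workers},
        "task": {"type": "worker", "index": task_index},
    }


def configure_cluster_env(worker_hosts, task_index, environ=None):
    """Write TF_CONFIG into `environ` and return the number of workers.

    `environ` defaults to os.environ; passing a plain dict keeps the
    helper testable without touching the real process environment.
    """
    environ = os.environ if environ is None else environ
    config = build_tf_config(worker_hosts, task_index)
    environ["TF_CONFIG"] = json.dumps(config)
    return len(config["cluster"]["worker"])
```

Each worker process would call the helper with the same host list and its own index before creating the distribution strategy, so that every process sees an identical `cluster` entry and a distinct `task` entry.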
distribution_utils.py 8.68 KB