    Transformer multi gpu, remove multi_gpu flag, distribution helper functions (#4457) · 29c9f985
    Katherine Wu authored
    * Add DistributionStrategy to transformer model
    
    * add num_gpu flag
    
    * Calculate per device batch size for transformer
    
    * remove reference to flags_core
    
    * Add synthetic data option to transformer
    
    * fix typo
    
    * add import back in
    
    * Use hierarchical copy
    
    * address PR comments
    
    * lint
    
    * fix spaces
    
    * group train op together to fix single GPU error
    
    * Fix translate bug (sorted_keys is a dict, not a list)
    
    * Change params to a default dict (translate.py was throwing errors because params didn't have the TPU parameters.)
    
    * Address PR comments. Removed multi gpu flag + more
    
    * fix lint
    
    * fix more lints
    
    * add todo for Synthetic dataset
    
    * Update docs
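
The "Calculate per device batch size" item above can be illustrated with a minimal sketch. It assumes the global batch size is split evenly across GPUs and that an uneven split is rejected rather than rounded. The function name `split_batch_across_gpus` is hypothetical and is not taken from this commit.

```python
def split_batch_across_gpus(batch_size, num_gpus):
    """Return the batch size each device would process.

    Hypothetical helper for illustration only; assumes an even split
    of the global batch across the available GPUs.
    """
    if num_gpus <= 1:
        # CPU-only or single-GPU runs keep the whole batch on one device.
        return batch_size

    if batch_size % num_gpus != 0:
        # Refuse uneven splits instead of silently dropping examples.
        raise ValueError(
            "Batch size {} is not divisible by the number of GPUs ({})."
            .format(batch_size, num_gpus))

    return batch_size // num_gpus


if __name__ == "__main__":
    print(split_batch_across_gpus(4096, 8))
```

Raising an error on an uneven split is one possible design choice; it keeps every device's batch the same size, at the cost of requiring the caller to pick a compatible global batch size.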
distribution_utils.py 2.9 KB