1. 14 Sep, 2018 1 commit
  2. 11 Sep, 2018 2 commits
  3. 05 Sep, 2018 4 commits
  4. 04 Sep, 2018 1 commit
  5. 02 Sep, 2018 2 commits
  6. 01 Sep, 2018 2 commits
  7. 30 Aug, 2018 1 commit
  8. 29 Aug, 2018 1 commit
  9. 28 Aug, 2018 2 commits
  10. 27 Aug, 2018 2 commits
    • ResNet eval_only mode (#5186) · d1c48afc
      Taylor Robie authored
      * Make ResNet robust to the case that epochs_between_evals does not divide train_epochs, and add an --eval_only option
      
      * add some comments to make the control flow easier to follow
      
      * address PR comments
      d1c48afc
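The scheduling problem this commit describes can be sketched as follows. This is a minimal illustration, not the repository's code: the function name `eval_schedule` is hypothetical, and returning a single zero-epoch cycle for `eval_only` is an assumption about how such a flag might be handled.

```python
import math


def eval_schedule(train_epochs, epochs_between_evals, eval_only=False):
    """Return how many epochs to train before each evaluation.

    The last cycle is shortened when epochs_between_evals does not
    divide train_epochs, so the total never exceeds train_epochs.
    """
    if eval_only:
        # Assumption: evaluation-only runs do one cycle with no training.
        return [0]
    if train_epochs <= 0 or epochs_between_evals <= 0:
        raise ValueError("train_epochs and epochs_between_evals must be positive")
    n_cycles = math.ceil(train_epochs / epochs_between_evals)
    schedule = [epochs_between_evals] * (n_cycles - 1)
    # The remainder (or a full interval, if it divides evenly) goes last.
    schedule.append(train_epochs - sum(schedule))
    return schedule


print(eval_schedule(10, 4))  # [4, 4, 2]
```

With 10 training epochs and an evaluation every 4, the final cycle trains for only 2 epochs instead of overshooting to 12.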
    • Add 5 epoch warmup to resnet (#5176) · 9bf586de
      Toby Boyd authored
      * Add 5 epoch warmup
      
      * get_lr with warm_up only for imagenet
      
      * Add base_lr, remove fp16 unittest arg validation
      
      * Remove validation check stopping v1 and FP16
      9bf586de
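A learning-rate warmup of the kind named in this commit might look like the sketch below. It assumes a linear ramp from zero to `base_lr` over the first five epochs; the commit message does not state the ramp shape, and `warmup_lr` and `steps_per_epoch` are illustrative names rather than the repository's API.

```python
def warmup_lr(base_lr, global_step, steps_per_epoch, warmup_epochs=5):
    """Linearly ramp the learning rate to base_lr during warmup.

    After warmup this sketch simply returns base_lr; a real schedule
    would apply its usual decay from that point on.
    """
    warmup_steps = warmup_epochs * steps_per_epoch
    if global_step < warmup_steps:
        return base_lr * global_step / warmup_steps
    return base_lr


# Halfway through a 5-epoch warmup the rate is half of base_lr.
print(warmup_lr(base_lr=0.1, global_step=250, steps_per_epoch=100))
```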
  11. 25 Aug, 2018 1 commit
  12. 22 Aug, 2018 1 commit
    • Fix convergence issues for MLPerf. (#5161) · 64710c05
      Reed authored
      * Fix convergence issues for MLPerf.
      
      Thank you to @robieta for helping me find these issues, and for providing an algorithm for the `get_hit_rate_and_ndcg_mlperf` function.
      
      This change causes every forked process to set a new seed, so that forked processes do not generate the same set of random numbers. This improves evaluation hit rates.
      
      Additionally, it adds a flag, --ml_perf, that makes further changes so that the evaluation hit rate can match the MLPerf reference implementation.
      
      I ran 4 times with --ml_perf and 4 times without. Without --ml_perf, the highest hit rates achieved by each run were 0.6278, 0.6287, 0.6289, and 0.6241. With --ml_perf, the highest hit rates were 0.6353, 0.6356, 0.6367, and 0.6353.
      
      * fix lint error
      
      * Fix failing test
      
      * Address @robieta's feedback
      
      * Address more feedback
      64710c05
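The seeding issue described above can be illustrated without forking at all. In this sketch, workers that inherit one seed draw identical samples, while workers given distinct seeds do not. Deriving the seed as `base_seed + worker_index` is an assumption made for illustration; the commit message only says that each forked process sets a new seed.

```python
import random


def worker_rng(base_seed, worker_index):
    """Return a generator seeded differently for each worker (assumed scheme)."""
    return random.Random(base_seed + worker_index)


def sample_negatives(rng, num_items, count):
    """Draw `count` random item ids, standing in for negative sampling."""
    return [rng.randrange(num_items) for _ in range(count)]


# Three workers that all inherit seed 0 produce the same "random" items.
shared = [sample_negatives(random.Random(0), 1000, 5) for _ in range(3)]

# Three workers that reseed by index produce different items.
distinct = [sample_negatives(worker_rng(0, i), 1000, 5) for i in range(3)]

print(shared[0] == shared[1] == shared[2])
print(len({tuple(s) for s in distinct}))
```

Duplicate samples across workers reduce the effective variety of the generated data, which is consistent with the commit's note that reseeding improves evaluation hit rates.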
  13. 20 Aug, 2018 1 commit
  14. 18 Aug, 2018 1 commit
    • Speed up cache construction. (#5131) · 5aee67b4
      Reed authored
      This is done by using a higher Pickle protocol version, which the Python docs describe as being "slightly more efficient". This reduces the file write time at the beginning from 2 1/2 minutes to 5 seconds.
      5aee67b4
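As a rough illustration of the change, the sketch below serializes the same object with protocol 0 and with `pickle.HIGHEST_PROTOCOL`. The `cache` contents are invented for the example, and comparing against protocol 0 is an assumption: the commit message does not say which protocol was used before.

```python
import io
import pickle

# Hypothetical stand-in for the cached data.
cache = {"user_ids": list(range(1000)), "item_ids": list(range(1000, 2000))}


def dump_bytes(obj, protocol):
    """Pickle obj into memory with the given protocol and return the bytes."""
    buf = io.BytesIO()
    pickle.dump(obj, buf, protocol=protocol)
    return buf.getvalue()


old = dump_bytes(cache, 0)  # text-based protocol
new = dump_bytes(cache, pickle.HIGHEST_PROTOCOL)  # binary protocol

# Both round-trip to the same object; the binary form is more compact.
assert pickle.loads(old) == pickle.loads(new) == cache
print(len(old), len(new))
```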
  15. 16 Aug, 2018 2 commits
  16. 15 Aug, 2018 1 commit
  17. 14 Aug, 2018 2 commits
    • Transformer partial fix (#5092) · 6f5967a0
      alope107 authored
      * Fix Transformer TPU crash in Python 2.X.
      
      - TensorFlow raises an error when tf_inspect.getfullargspec is called on
      a functools.partial in Python 2.X. This issue would be hit during the
      eval stage of the Transformer TPU model. This change replaces the call
      to functools.partial with a lambda to work around the issue.
      
      * Remove unused import from transformer_main.
      
      * Fix lint error.
      6f5967a0
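The workaround amounts to swapping a `functools.partial` object for an equivalent lambda. The sketch below uses a hypothetical `scale` function, not code from the Transformer model, to show that the two are interchangeable for callers while only the lambda is a plain Python function that introspection helpers treat like any other.

```python
import functools
import inspect


def scale(x, factor):
    """Hypothetical function whose `factor` argument we want to pre-bind."""
    return x * factor


# Original style: bind the argument with functools.partial.
with_partial = functools.partial(scale, factor=2)

# Workaround style: bind the same argument with a lambda.
with_lambda = lambda x: scale(x, factor=2)

print(with_partial(21) == with_lambda(21))  # True

# A partial is a callable object, not a function; a lambda is a function.
print(inspect.isfunction(with_partial))  # False
print(inspect.isfunction(with_lambda))   # True
```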
    • Resnet transfer learning (#5047) · 7bffd37b
      Zac Wellmer authored
      * warm start a ResNet with all but the dense layer and only update the final layer weights when fine tuning
      
      * Update README for Transfer Learning
      
      * make lint happy and fix variable naming error related to scaled gradients
      
      * edit the test cases for cifar10 and imagenet to reflect the default case of no fine tuning
      7bffd37b
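The variable selection behind this kind of fine tuning can be sketched in plain Python. The variable names and the `/dense/` pattern below are assumptions chosen for illustration; the actual names depend on how the model is built.

```python
import re

# Hypothetical variable names for a small ResNet-like model.
variables = [
    "resnet_model/conv2d/kernel",
    "resnet_model/batch_normalization/gamma",
    "resnet_model/dense/kernel",
    "resnet_model/dense/bias",
]


def split_for_fine_tuning(names, final_layer_pattern=r"/dense/"):
    """Split names into (warm-started, trainable) for fine tuning.

    Everything except the final dense layer is restored from the
    checkpoint; only the final dense layer is updated during training.
    """
    final = re.compile(final_layer_pattern)
    warm_start = [n for n in names if not final.search(n)]
    trainable = [n for n in names if final.search(n)]
    return warm_start, trainable


warm_start, trainable = split_for_fine_tuning(variables)
print(warm_start)
print(trainable)
```

Restricting the gradient update to `trainable` leaves the pretrained feature layers fixed, which matches the commit's description of updating only the final layer weights.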
  18. 13 Aug, 2018 1 commit
  19. 10 Aug, 2018 1 commit
  20. 02 Aug, 2018 2 commits
  21. 01 Aug, 2018 1 commit
  22. 31 Jul, 2018 8 commits