1. 17 Mar, 2020 1 commit
  2. 05 Mar, 2020 1 commit
  3. 02 Mar, 2020 1 commit
  4. 25 Feb, 2020 1 commit
  5. 27 Nov, 2019 1 commit
  6. 28 Oct, 2019 1 commit
  7. 21 Oct, 2019 1 commit
  8. 16 Oct, 2019 1 commit
    • Add support for the tf.keras.mixed_precision API in NCF · cb913691
      Reed Wanderman-Milne authored
      To test, I did 50 fp32 runs and 50 fp16 runs. I used the following command:
      
      python ncf_keras_main.py --dataset=ml-20m --num_gpus=1 --train_epochs=10 --clean --batch_size=99000 --learning_rate=0.00382059 --beta1=0.783529 --beta2=0.909003 --epsilon=1.45439e-7 --layers=256,256,128,64 --num_factors=64 --hr_threshold=0.635 --ml_perf --nouse_synthetic_data --data_dir ~/ncf_data_dir_python3 --model_dir ~/tmp_model_dir --keras_use_ctl
      
      For the fp16 runs, I added --dtype=fp16. The average hit-rate for both fp16 and fp32 was 0.6365. I also did 50 runs with the mixed precision graph rewrite, and the average hit-rate was 0.6363. The difference is likely due to noise.
      
      PiperOrigin-RevId: 275059871
      cb913691
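      A minimal sketch of what enabling mixed precision through the tf.keras.mixed_precision API looks like (illustrative only, not the NCF code itself; the modern TF 2.x spelling is shown, the commit-era API lived under tf.keras.mixed_precision.experimental, and the toy model below is hypothetical):

          import tensorflow as tf

          # Layers compute in float16 while variables stay in float32.
          tf.keras.mixed_precision.set_global_policy("mixed_float16")

          inputs = tf.keras.Input(shape=(64,))
          x = tf.keras.layers.Dense(256, activation="relu")(inputs)
          # Keep the final output in float32 for numerical stability.
          outputs = tf.keras.layers.Dense(1, dtype="float32")(x)
          model = tf.keras.Model(inputs, outputs)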
  9. 07 Oct, 2019 1 commit
  10. 09 Sep, 2019 1 commit
  11. 04 Sep, 2019 1 commit
  12. 30 Aug, 2019 1 commit
  13. 26 Aug, 2019 1 commit
  14. 23 Aug, 2019 1 commit
  15. 20 Aug, 2019 2 commits
  16. 19 Aug, 2019 1 commit
    • Do not expose --max_train_steps in models that do not use it. · 824ff2d6
      Reed Wanderman-Milne authored
      Only the V1 resnet model uses --max_train_steps. This change stops exposing the flag in the keras_application_models, mnist, keras resnet, and CTL resnet models. Before this change, those models accepted the flag but ignored it.
      
      I also removed the "max_train" argument from the run_synthetic function, since it was only meaningful for the V1 resnet model. Instead, the V1 resnet model now passes --max_train_steps=1 directly to run_synthetic.
      
      PiperOrigin-RevId: 264269836
      824ff2d6
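      A rough sketch of the flag-gating idea, assuming an absl-flags setup; the helper name and argument below are hypothetical, not the repo's actual API:

          from absl import flags

          def define_performance_flags(expose_max_train_steps=False):
            # Only define --max_train_steps for models that consume it,
            # so other models reject it instead of silently ignoring it.
            if expose_max_train_steps:
              flags.DEFINE_integer(
                  "max_train_steps", None,
                  "Cap on the total number of training steps (V1 resnet only).")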
  17. 16 Aug, 2019 1 commit
    • Add multi-worker benchmarks to Keras ResNet model. · ff6c3b1e
      Ayush Dubey authored
      Also add `worker_hosts` and `task_index` flags. These flags enable running the
      model over multiple hosts by passing the cluster information via the command line.
      
      Setting `TF_CONFIG` will continue to work.
      
      PiperOrigin-RevId: 263825245
      ff6c3b1e
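      A hedged sketch of how flags like `worker_hosts` and `task_index` can be translated into the TF_CONFIG environment variable that tf.distribute reads; the values and variable names below are placeholders, not the benchmark's actual wiring:

          import json
          import os

          worker_hosts = "host1:2222,host2:2222".split(",")  # from --worker_hosts
          task_index = 0                                      # from --task_index

          # tf.distribute picks up the cluster spec from TF_CONFIG.
          os.environ["TF_CONFIG"] = json.dumps({
              "cluster": {"worker": worker_hosts},
              "task": {"type": "worker", "index": task_index},
          })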
  18. 06 Aug, 2019 1 commit
  19. 23 Jul, 2019 1 commit
  20. 21 Jun, 2019 2 commits
  21. 19 Jun, 2019 1 commit
    • Add XLA to transformer (#7048) · 269581dc
      Toby Boyd authored
      
      
      * set default steps to 300K.
      
      * Log flags to perfzero.
      
      * Add XLA support to transformer
      
      - Moved config logic to keras_utils
      - Added enable_xla flag to _performance flags
      - Did not refactor the enable_xla flag out of keras resnet, because it
        relies on reading FLAGS in the estimator keras path; that refactor is
        left for another time.
      
      * fix g3 lint complaint.
      
      * Refactor set config into keras_utils.
      
      * Move flags out of main.
      
      * pipe through enable_xla
      
      * Update official/transformer/v2/misc.py
      Co-Authored-By: Reed <reedwm@google.com>
      269581dc
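      A small sketch of what an --enable_xla flag typically toggles; this is illustrative and not the keras_utils code itself (the function name below is hypothetical):

          import tensorflow as tf

          def set_performance_options(enable_xla=False):
            if enable_xla:
              # Turn on XLA JIT compilation for TensorFlow graphs.
              tf.config.optimizer.set_jit(True)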
  22. 06 Jun, 2019 1 commit
  23. 18 May, 2019 1 commit
  24. 15 May, 2019 1 commit
  25. 11 May, 2019 1 commit
  26. 01 May, 2019 1 commit
    • Add --fp16_implementation option. (#6703) · b691578c
      Reed authored
      This option allows the new tf.train.experimental.enable_mixed_precision_graph_rewrite() function to be used for fp16 instead of manual casts.
      b691578c
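      A minimal sketch of the graph-rewrite path this option selects, using the function named in the commit (it was later removed from TensorFlow in favor of the Keras mixed precision API, so this only runs on TF versions that still ship it):

          import tensorflow as tf

          opt = tf.keras.optimizers.SGD(0.01)
          # The rewrite wraps the optimizer so TensorFlow inserts the fp16
          # casts and loss scaling automatically instead of manual casts.
          opt = tf.train.experimental.enable_mixed_precision_graph_rewrite(opt)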
  27. 26 Apr, 2019 2 commits
  28. 03 Apr, 2019 1 commit
  29. 28 Mar, 2019 1 commit
    • Added benchmark test and convergence test for the NCF model (#6318) · 4c11b84b
      Shining Sun authored
      * initial commit
      
      * bug fix
      
      * Move build_stats from common to keras main, because it is only applicable in keras
      
      * remove trailing blank line
      
      * add test for synth data
      
      * add kwargs to init
      
      * add kwargs to function invocation
      
      * correctly pass kwargs
      
      * debug
      
      * debug
      
      * debug
      
      * fix super init
      
      * bug fix
      
      * fix local_flags
      
      * fix import
      
      * bug fix
      
      * fix log_steps flag
      
      * bug fix
      
      * bug fix: add missing return value
      
      * resolve double-defined flags
      
      * lint fix
      
      * move log_steps flag to benchmark flags
      
      * fix lint
      
      * lint fix
      
      * lint fix
      
      * try flag core default values
      
      * bug fix
      
      * bug fix
      
      * bug fix
      
      * debug
      
      * debug
      
      * remove debug prints
      
      * rename benchmark methods
      
      * flag bug fix for synth benchmark
      4c11b84b
  30. 20 Mar, 2019 1 commit
  31. 07 Mar, 2019 1 commit
  32. 21 Feb, 2019 1 commit
    • Multi-worker support for Resnet. (#6206) · f2e90945
      Ayush Dubey authored
      * Update official resnet for multi worker training with distribution strategies.
      
      * Fixes for multi worker training.
      
      * Fix call to `get_distribution_strategy`.
      
      * Undo test change.
      
      * Fix spacing.
      
      * Move cluster configuration to distribution_utils.
      
      * Move train_and_evaluate out of loop.  Also, update docstrings for multi-worker flags and add use_train_and_evaluate flag.
      
      * Update distribution_strategy flag to match exported name for collective strategy.
      f2e90945
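      A hedged sketch of multi-worker training with the collective strategy this change targets; the model below is a placeholder, and on older TF releases the strategy lived under tf.distribute.experimental:

          import tensorflow as tf

          # Each worker builds the same strategy; cluster membership comes
          # from TF_CONFIG (or from flags that populate it, as in the
          # cluster configuration mentioned above).
          strategy = tf.distribute.MultiWorkerMirroredStrategy()
          with strategy.scope():
            model = tf.keras.Sequential([tf.keras.layers.Dense(10)])
            model.compile(optimizer="sgd", loss="mse")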
  33. 13 Feb, 2019 1 commit
  34. 08 Feb, 2019 1 commit
  35. 06 Feb, 2019 1 commit
  36. 05 Feb, 2019 1 commit
    • tf_upgrade_v2 on resnet and utils folders. (#6154) · d6b2b83c
      Goldie Gadde authored
      * Add resnet56 short tests. (#6101)
      
      * Add resnet56 short tests.
      - created base benchmark module
      - renamed the accuracy test class to contain the word Accuracy,
      which means all the jobs need to be updated and history is lost,
      but it is worth it.
      - short tests are mostly copied from Shining's, with an OSS refactor
      
      * Address feedback.
      
      * Move flag_methods to init
      - Address setting default flags repeatedly.
      
      * Rename accuracy tests.
      
      * Lint errors resolved.
      
      * fix model_dir set to flags.data_dir.
      
      * fixed not fully pulling out flag_methods.
      
      * Use core mirrored strategy in official models (#6126)
      
      * Imagenet short tests (#6132)
      
      * Add short imagenet tests (taken from seemuch)
      - also rename to match the go-forward naming
      
      * fix method name
      
      * Update doc strings.
      
      * Fix gpu number.
      
      * points default data_dir to child folder. (#6131)
      
      The failed test was Python 2 and was a Kokoro failure.
      
      * Imagenet short tests (#6136)
      
      * Add short imagenet tests (taken from seemuch)
      - also rename to match the go-forward naming
      
      * fix method name
      
      * Update doc strings.
      
      * Fix gpu number.
      
      * Add fill_objects
      
      * fixed calling the wrong class in super().
      
      * fix lint issue.
      
      * Flag (#6121)
      
      * Fix the turn_off_ds flag problem
      
      * add param names to all args
      
      * Export benchmark stats using tf.test.Benchmark.report_benchmark() (#6103)
      
      * Export benchmark stats using tf.test.Benchmark.report_benchmark() (see the usage sketch after this entry)
      
      * Fix python style using pyformat
      
      * Typos. (#6120)
      
      * log verbosity=2 logs every epoch with no progress bars (#6142)
      
      * tf_upgrade_v2 on resnet and utils folder.
      
      * tf_upgrade_v2 on resnet and utils folder.
      d6b2b83c
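      The report_benchmark() export mentioned above can look roughly like this; the class name and numbers are placeholders, not the repo's actual benchmark code:

          import tensorflow as tf

          class ResnetBenchmarkSketch(tf.test.Benchmark):

            def benchmark_synthetic(self):
              wall_time_sec = 12.3  # a real run would measure this
              # report_benchmark() exports the stats to the benchmark tooling.
              self.report_benchmark(
                  iters=100,
                  wall_time=wall_time_sec,
                  extras={"exp_per_second": 1234.5})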
  37. 13 Oct, 2018 1 commit