- 21 Jul, 2019 1 commit
-
-
Zongwei Zhou authored
-
- 20 Jul, 2019 1 commit
-
-
Toby Boyd authored
-
- 19 Jul, 2019 2 commits
-
-
guptapriya authored
The current approach checks for the presence of contrib. Sometimes this is not sufficient (e.g., when testing TF 1 with enable_v2_behavior=True, which is what internal tests currently do).
- 18 Jul, 2019 1 commit
-
-
Toby Boyd authored
* Added benchmarks and common flags. * Add cpu tests. * Add tracking epoch times. * fix transformer. * Add examples_per_second. * fix pylint
-
- 11 Jul, 2019 1 commit
-
-
Toby Boyd authored
* Move to global_step. * Hook to use global_step. * fix comment start step 1 not step 0. * remove hack used for testing. * Add docstring.
-
- 03 Jul, 2019 1 commit
-
-
Toby Boyd authored
* Fix unit test failures. * 96% of TF 2.0 tests on GPU are passing. * Currently all passing GPU and CPU TF 2.0 * Address code comments. * use tf 2.0 cast. * Comment about working on TF 2.0 CPU * Uses contrib turn off for TF 2.0. * Fix wide_deep and add keras_common_tests. * use context to get num_gpus. * Switch to tf.keras.metrics
-
- 02 Jul, 2019 1 commit
-
-
Yuefeng Zhou authored
when there are multiple workers.
-
- 19 Jun, 2019 1 commit
-
-
Toby Boyd authored
* set default steps to 300K. * Log flags to perfzero. * Add XLA support to transformer - Moved config logic to keras_utils - Added enable_xla flag to _performance flags - Did not refactor enable_xla flag from keras resnet due to reliance on calling FLAGs in estimator keras and that is a needed refactor for another time. * fix g3 lint complaint. * Refactor set config into keras_utils. * Move flags out of main. * pipe through enable_xla * Update official/transformer/v2/misc.py Co-Authored-By:Reed <reedwm@google.com>
-
- 24 May, 2019 1 commit
-
-
Toby Boyd authored
-
- 29 Apr, 2019 1 commit
-
-
Igor authored
Replace per_device with per_replica and PerDevice with PerReplica, because the PerDevice concept was renamed and doesn't exist anymore. (#6693)
-
- 26 Apr, 2019 1 commit
-
-
Ayush Dubey authored
* Add num_packs flag for MirroredStrategy's cross device ops. * fix parens * Fix lint errors and make all_reduce_alg more robust. * Set default num_packs to 1
-
- 25 Apr, 2019 1 commit
-
-
Ayush Dubey authored
* Remove contrib AllReduceCrossDeviceOps and update all_reduce_alg options with MirroredStrategy. * cleanup
-
- 24 Apr, 2019 1 commit
-
-
Yuefeng Zhou authored
-
- 11 Apr, 2019 1 commit
-
-
rxsang authored
* Make BatchTimestamp object printable. * Removing trailing whitespace. * Make BatchTimestamp repr a string.
-
- 08 Apr, 2019 1 commit
-
-
Shining Sun authored
* add ds support for ncf * remove comments for in_top_k * avoid expanding the input layers * resolve comments and fix lint * Added some comments in code and fix lint * fix lint * add some documentation * add tensorflow imports
-
- 01 Apr, 2019 1 commit
-
-
Haoyu Zhang authored
-
- 29 Mar, 2019 1 commit
-
-
Shining Sun authored
-
- 28 Mar, 2019 1 commit
-
-
Shining Sun authored
* initial commit * bug fix * Move build_stats from common to keras main, because it is only applicable in keras * remove trailing blank line * add test for synth data * add kwargs to init * add kwargs to function invocation * correctly pass kwargs * debug * debug * debug * fix super init * bug fix * fix local_flags * fix import * bug fix * fix log_steps flag * bug fix * bug fix: add missing return value * resolve double-defined flags * lint fix * move log_steps flag to benchmark flag * fix lint * lint fix * lint fix * try flag core default values * bug fix * bug fix * bug fix * debug * debug * remove debug prints * rename benchmark methods * flag bug fix for synth benchmark
-
- 19 Mar, 2019 1 commit
-
-
Soroush Radpour authored
-
- 07 Mar, 2019 1 commit
-
-
Ayush Dubey authored
* s/CollectiveAllReduceStrategy/MultiWorkerMirroredStrategy * More s/contrib.distribute/distribute.experimental * Collective communication options in MultiWorkerMirroredStrategy. * Minor fixes * No checkpointing if multi worker. * turn off checkpointing * fix lint
-
- 02 Mar, 2019 1 commit
-
-
Taylor Robie authored
* fix resnet breakage and add keras end-to-end tests * delint * address PR comments
-
- 01 Mar, 2019 1 commit
-
-
Shining Sun authored
* tmp commit * tmp commit * first attempt (without eval) * Bug fixes * bug fixes * training done * Loss NAN, no eval * Loss weight problem solved * resolve the NAN loss problem * Problem solved. Clean up needed * Added a todo * Remove debug prints * Extract get_optimizer to ncf_common * Move metrics computation back to neumf; use DS.scope api * Extract DS.scope code to utils * lint fixes * Move obtaining DS above producer.start to avoid race condition * move pt 1 * move pt 2 * Update the run script * Wrap keras_model related code into functions * Update the doc for softmax_logitfy and change the method name * Resolve PR comments * working version with: eager, DS, batch and no masks * Remove git conflict indicator * move reshape to neumf_model * working version, not converge * converged * fix a test * more lint fix * more lint fix * more lint fixes * more lint fix * Removed unused imports * fix test * dummy commit for kicking off checks * fix lint issue * dummy input to kick off checks * dummy input to kick off checks * add collective to dist strat * addressed review comments * add a doc string
-
- 28 Feb, 2019 2 commits
-
-
Ayush Dubey authored
* s/CollectiveAllReduceStrategy/MultiWorkerMirroredStrategy * More s/contrib.distribute/distribute.experimental
-
Tayo Oguntebi authored
-
- 21 Feb, 2019 1 commit
-
-
Ayush Dubey authored
* Update official resnet for multi worker training with distribution strategies. * Fixes for multi worker training. * Fix call to `get_distribution_strategy`. * Undo test change. * Fix spacing. * Move cluster configuration to distribution_utils. * Move train_and_evaluate out of loop. Also, update docstrings for multi-worker flags and add use_train_and_evaluate flag. * Update distribution_strategy flag to match exported name for collective strategy.
-
- 14 Feb, 2019 1 commit
-
-
Toby Boyd authored
* One device from contrib to core. * remove test code.
-
- 13 Feb, 2019 1 commit
-
-
Yuefeng Zhou authored
* Add a flag to specify distribution strategies. * Fix a small error. * Address comments. * Address comments. * Fix typos.
-
- 12 Feb, 2019 1 commit
-
-
Toby Boyd authored
* Remove contrib thread pool. * Remove commented out contrib import. * Fix lint issues. * move tf.data.options higher. Tweak line breaks. * do not monkey patch on or off if dist_strat is off * Do not monkey patch if no_dist_strat. * Fix file permissions. * fix file permissions. * Revert change to main. Add hasattr(tf, 'contrib') to utils * compat.v1.logging * tf.compat.v1.get_local_variables.
-
- 11 Feb, 2019 1 commit
-
-
Toby Boyd authored
* Remove contrib thread pool. * Remove commented out contrib import. * Fix lint issues. * move tf.data.options higher. Tweak line breaks.
-
- 09 Feb, 2019 1 commit
-
-
Yuefeng Zhou authored
* Add pure synthetic data to keras resnet mode. * Add imports. * Address comments. * update comment * Undo set up synthetic data for real data path. * update comment * Address comment * Remove trailing whitespace. * s/make_data_set_iterator/make_dataset_iterator/
-
- 08 Feb, 2019 1 commit
-
-
Goldie Gadde authored
This reverts commit 57e07520.
-
- 06 Feb, 2019 1 commit
-
-
Goldie Gadde authored
This reverts commit d6b2b83c.
-
- 05 Feb, 2019 1 commit
-
-
Goldie Gadde authored
* Add resnet56 short tests. (#6101) * Add resnet56 short tests. - created base benchmark module - renamed accuracy test class to contain the word Accuracy which will result in a need to update all the jobs and a loss of history but is worth it. - short tests are mostly copied from shining with oss refactor * Address feedback. * Move flag_methods to init - Address setting default flags repeatedly. * Rename accuracy tests. * Lint errors resolved. * fix model_dir set to flags.data_dir. * fixed not fully pulling out flag_methods. * Use core mirrored strategy in official models (#6126) * Imagenet short tests (#6132) * Add short imagenet tests (taken from seemuch) - also rename to match go forward naming * fix method name * Update doc strings. * Fix gpu number. * points default data_dir to child folder. (#6131) Failed test is python2 and was a kokoro failure * Imagenet short tests (#6136) * Add short imagenet tests (taken from seemuch) - also rename to match go forward naming * fix method name * Update doc strings. * Fix gpu number. * Add fill_objects * fixed calling wrong class in super. * fix lint issue. * Flag (#6121) * Fix the turn_off_ds flag problem * add param names to all args * Export benchmark stats using tf.test.Benchmark.report_benchmark() (#6103) * Export benchmark stats using tf.test.Benchmark.report_benchmark() * Fix python style using pyformat * Typos. (#6120) * log verbosity=2 logs every epoch no progress bars (#6142) * tf_upgrade_v2 on resnet and utils folder. * tf_upgrade_v2 on resnet and utils folder.
-
- 01 Feb, 2019 1 commit
-
-
guptapriya authored
-
- 27 Dec, 2018 1 commit
-
-
Shining Sun authored
-
- 24 Dec, 2018 1 commit
-
-
Toby Boyd authored
-
- 21 Dec, 2018 1 commit
-
-
Shining Sun authored
-
- 20 Dec, 2018 2 commits
-
-
Shining Sun authored
-
Shining Sun authored
-