- 16 Aug, 2019 2 commits
-
-
Reed authored
-
Ayush Dubey authored
Also add `worker_hosts` and `task_index` flags. These flags enable running the model over multiple hosts by passing the cluster information via command line. Setting `TF_CONFIG` will continue to work.
PiperOrigin-RevId: 263825245
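The commit message does not say how the two flags relate to `TF_CONFIG`. As a rough sketch only, assuming `worker_hosts` is a comma-separated list of `host:port` strings and that every host is a worker, the equivalent `TF_CONFIG` value could be built as follows. The helper name `build_tf_config` is hypothetical and is not part of the repository.

```python
import json
import os


def build_tf_config(worker_hosts: str, task_index: int) -> str:
    """Return a TF_CONFIG JSON string for a worker-only cluster.

    Hypothetical helper: the comma-separated format of `worker_hosts`
    is an assumption, not something stated in the commit message.
    """
    hosts = [h.strip() for h in worker_hosts.split(",") if h.strip()]
    if not 0 <= task_index < len(hosts):
        raise ValueError("task_index must index into worker_hosts")
    # TF_CONFIG carries the cluster layout plus this process's own role.
    return json.dumps({
        "cluster": {"worker": hosts},
        "task": {"type": "worker", "index": task_index},
    })


if __name__ == "__main__":
    # Example: the first of two workers.
    os.environ["TF_CONFIG"] = build_tf_config("host1:2222,host2:2222", 0)
    print(os.environ["TF_CONFIG"])
```

Each host would run the same command with a different `task_index`, which is what makes the flags a convenient alternative to exporting a separate `TF_CONFIG` per machine.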
-
- 06 Aug, 2019 1 commit
-
-
Toby Boyd authored
* force_v2_in_keras_compile FLAG defaults to None; added a separate temp path.
* Switch to force testing the v1 path, not the v2 path.
* Rename function force_v1_path.
-
- 05 Aug, 2019 1 commit
-
-
Toby Boyd authored
-
- 02 Aug, 2019 1 commit
-
-
Haoyu Zhang authored
261339941 by haoyuzhang<haoyuzhang@google.com>:
Own library functions in Keras ResNet models, and remove dependencies on v1 Estimator version of ResNet models. Most dependencies that the Keras version has are related to data input pipelines. Created dedicated files (cifar_preprocessing.py, imagenet_preprocessing.py) to collect all logic handling Cifar and ImageNet data input function.
--
261339166 by haoyuzhang<haoyuzhang@google.com>: Internal change
261317601 by akuegel<akuegel@google.com>: Internal change
261218818 by A. Unique TensorFlower<gardener@tensorflow.org>: Internal change
PiperOrigin-RevId: 261339941
-
- 01 Aug, 2019 1 commit
-
-
Haoyu Zhang authored
261171038 by gjn<gjn@google.com>:
Remove weight_decay_rate 0 early exit check.
Removing this code path should be fine since it was not doing what it was meant to do. Since weight_decay_rate is actually a tensor, the equality check was only looking at the id of the object and comparing it to 0, which should never be true. Evaluating a tensor is also not what we want to do at this point in the code, so it should be fine to simply remove this code.
--
261169862 by haoyuzhang<haoyuzhang@google.com>: Internal change
261153520 by haoyuzhang<haoyuzhang@google.com>: Internal change
261140302 by hongkuny<hongkuny@google.com>: Clean up
--
PiperOrigin-RevId: 261171038
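The reasoning in the message above can be illustrated without TensorFlow. This is a minimal sketch, assuming an object that, like the tensor described in the commit, does not define value-based equality. `SymbolicTensor` and `should_skip_weight_decay` are hypothetical stand-ins, not names from the repository.

```python
class SymbolicTensor:
    """Hypothetical stand-in for a graph-mode tensor (not a TensorFlow class).

    It defines no __eq__, so Python falls back to identity comparison,
    which is the behaviour the commit message describes.
    """

    def __init__(self, name):
        self.name = name


def should_skip_weight_decay(weight_decay_rate):
    # The kind of early-exit check the commit removes: it compares the
    # object itself to 0 rather than the value the tensor will hold.
    return weight_decay_rate == 0


rate = SymbolicTensor("weight_decay_rate")
print(should_skip_weight_decay(rate))  # False: the object is never equal to 0
print(should_skip_weight_decay(0.0))   # True: only a plain Python number matches
```

Because the check can only succeed for plain Python numbers, the early exit never fires once the rate is a tensor, so removing it does not change behaviour.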
-
- 31 Jul, 2019 1 commit
-
-
Toby Boyd authored
-
- 24 Jul, 2019 1 commit
-
-
Soroush Radpour authored
-
- 23 Jul, 2019 1 commit
-
-
Toby Boyd authored
* Add force_run_distributed tests.
* Added enable_eager
* r/force_run_distributed/force_v2_in_keras_compile
* Adding force_v2 tests and FLAGs.
* Rename method to avoid conflict.
* Add cpu force_v2 tests.
* fix lint, wrap line.
* change to force_v2_in_keras_compile
* Update method name.
* Lower mlperf target to 0.736.
-
- 19 Jul, 2019 1 commit
-
-
Jing Li authored
* Merged commit includes the following changes:
258867180 by jingli<jingli@google.com>: Add new folders for upcoming reorg in model garden.
--
258893811 by hongkuny<hongkuny@google.com>: Adds summaries for metrics, allowing metrics inside keras.model.
--
258893048 by isaprykin<isaprykin@google.com>: Remove the `cloning` argument to `compile()`. Keras models are distributed by cloning in graph mode and without cloning in eager mode as of the change # 258652546.
--
258881002 by hongkuny<hongkuny@google.com>: Fix lint.
--
258874998 by hongkuny<hongkuny@google.com>: Internal
--
258872662 by hongkuny<hongkuny@google.com>: Fix doc
--
PiperOrigin-RevId: 258867180
* Create __init__.py
* Update __init__.py
* Update __init__.py
* Update __init__.py
-
- 21 Jun, 2019 1 commit
-
-
Toby Boyd authored
* XLA FP32 and first test
* More XLA benchmarks FP32.
* Add eager to NCF and refactor resnet.
* fix v2_0 calls and more flag refactor.
* Remove extra flag args.
* 90 epoch default
* add return
* remove xla not used by estimator.
* Remove duplicate run_eagerly.
* fix flag defaults.
* Remove fp16_implementation flag option.
* Remove stop early on mlperf test.
* remove unneeded args.
* load flags from keras mains.
-
- 20 Jun, 2019 1 commit
-
-
Haoyu Zhang authored
* Do not set learning phase when skipping eval
* Do not set learning phase in no dist strat case
* Added device placement, tweaked benchmarks
* Added tweaked benchmarks for Cifar
* Fix device scope
* Fix lint
* Add explicit GPU placement flag
* Also run accuracy test with explicit GPU placement
* Added doc string
-
- 19 Jun, 2019 1 commit
-
-
Toby Boyd authored
* set default steps to 300K.
* Log flags to perfzero.
* Add XLA support to transformer
  - Moved config logic to keras_utils
  - Added enable_xla flag to _performance flags
  - Did not refactor enable_xla flag from keras resnet due to reliance on calling FLAGs in estimator keras and that is a needed refactor for another time.
* fix g3 lint complaint.
* Refactor set config into keras_utils.
* Move flags out of main.
* pipe through enable_xla
* Update official/transformer/v2/misc.py
Co-Authored-By: Reed <reedwm@google.com>
-
- 10 Jun, 2019 1 commit
-
-
rxsang authored
-
- 06 Jun, 2019 1 commit
-
-
Reed authored
Before, there was a global default loss scale for all models. Currently, only resnet uses loss scaling, but this will be useful once more models support it.
-
- 31 May, 2019 1 commit
-
-
Haoyu Zhang authored
* Support pure eager execution in ResNet50
* Use smaller batch size
-
- 23 May, 2019 2 commits
- 15 May, 2019 1 commit
-
-
Rachel Lim authored
* Added 'tfdata_exp' version of all benchmarks which set FLAGS.tf_data_experimental_slack = True. Renamed `data_prefetch_with_slack` to `data_delay_prefetch` (haoyu's change) to make the names more distinct.
* Add flag to resnet input pipeline and surface through keras_imagenet_main.py
-
- 10 May, 2019 2 commits
-
-
Haoyu Zhang authored
* Fix trivial model to work properly with fp16
* Add comment on manual casting
-
Haoyu Zhang authored
* Do not report metrics in performance benchmarks
* Rename flag
-
- 09 May, 2019 1 commit
-
-
Haoyu Zhang authored
* Add learning rate tensor. This makes training slower
* Improve LearningRateSchedule with better efficiency
* Fix lint error
* Replace constant definition with existing one
-
- 07 May, 2019 1 commit
-
-
Haoyu Zhang authored
-
- 04 May, 2019 1 commit
-
-
Haoyu Zhang authored
* Enable CuDNN BatchNorm spatial persistent by default; Remove 2nd zero padding layer
* Apply scale=False and fused=True consistently to BatchNorm layers
* Undo remove padding layer
* Replace zero padding with padding attribute in max pooling for better performance
* Resolve comments
* Revert "Replace zero padding with padding attribute in max pooling for better performance"
  This reverts commit ad49db057c800ecac008eec1057005bd2c08ac73.
-
- 29 Apr, 2019 1 commit
-
-
Igor authored
* Add benchmarks with the --cloning flag to Resnet and NCF.
* Renamed cloning to clone_model_in_keras_dist_strat. Dropped a few tests that aren't essential.
* Fixed up the formatting after re-naming the flag to a much longer name. Thanks, lint.
* Fixed the lint error in nfc_common.py
-
- 25 Apr, 2019 1 commit
-
-
Haoyu Zhang authored
Reason: test failures because contrib is not available in V2.
This reverts commit 325dd761.
-
- 24 Apr, 2019 2 commits
-
-
Haoyu Zhang authored
-
Haoyu Zhang authored
* Introduce a short sleep before ds.prefetch in tf.data.
* Further limit dataset threads to reduce CPU contention
* Tuned dataset sleep time
* Rename dataset sleep flag; enable it only for Keras Graph mode
-
- 17 Apr, 2019 2 commits
-
-
rxsang authored
- 11 Apr, 2019 1 commit
-
-
rxsang authored
* Revert "Revert " Ensure static shapes when enabling XLA in Resnet Keras model (#6508)" (#6517)"
  This reverts commit cc9eef76.
* Set `batch_size` to keras.Input in non-eager mode. Eager mode currently has an OOM problem.
* Add comments for enable_eager flag.
* Always set drop_remainder=True.
* Only set drop_remainder=True for XLA.
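Why `drop_remainder=True` yields static shapes can be seen in a small pure-Python sketch. It only mimics the batching semantics of the `drop_remainder` argument of `tf.data.Dataset.batch`, and the helper `batch_items` is hypothetical, not code from the repository.

```python
def batch_items(items, batch_size, drop_remainder=False):
    """Group items into consecutive batches of at most batch_size.

    With drop_remainder=True a final short batch is discarded, so every
    batch has exactly batch_size elements (a fixed, known batch dimension).
    """
    batches = [items[i:i + batch_size]
               for i in range(0, len(items), batch_size)]
    if drop_remainder and batches and len(batches[-1]) < batch_size:
        batches.pop()
    return batches


examples = list(range(10))
print([len(b) for b in batch_items(examples, 4)])
print([len(b) for b in batch_items(examples, 4, drop_remainder=True)])
```

Without dropping the remainder, the last batch can be smaller than the others, so the batch dimension is not known ahead of time. That is presumably why the commit ties the setting to XLA, which the message says needs static shapes.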
-
- 08 Apr, 2019 1 commit
-
-
Shining Sun authored
* add ds support for ncf
* remove comments for in_top_k
* avoid expanding the input layers
* resolve comments and fix lint
* Added some comments in code and fix lint
* fix lint
* add some documentation
* add tensorflow imports
-
- 05 Apr, 2019 1 commit
-
-
Haoyu Zhang authored
* Add profiler callback for Keras models
* Update build stats to identify time callback by type
* Add warning message when both TensorBoard and profiler callbacks are used
-
- 03 Apr, 2019 4 commits
-
-
Reed authored
-
Haoyu Zhang authored
Reason: breaks the 1-gpu nightly test.
This reverts commit 371645fc.
-
Haoyu Zhang authored
-
rxsang authored
Don't pass `batch_size` to keras.layers.Input in the DS multi-replica case. There is currently a bug on the Keras side that causes an incompatible batch size error.
-
- 02 Apr, 2019 1 commit
-
-
rxsang authored
* Update resnet_model.py
* Ensure static shapes when enabling XLA.
* Define `drop_remainder` as a variable.
* Handles per_replica_batch_size in non-XLA mode
* Remove trailing whitespace.
-
- 28 Mar, 2019 1 commit
-
-
Haoyu Zhang authored
-
- 26 Mar, 2019 1 commit
-
-
Yuefeng Zhou authored
required by multi-node collective ops in eager mode.
-