- 03 Jul, 2019 1 commit
-
-
Toby Boyd authored
* Fix unit tests failures.
* 96% of TF 2.0 tests on GPU are passing.
* Currently all passing GPU and CPU TF 2.0
* Address code comments.
* use tf 2.0 cast.
* Comment about working on TF 2.0 CPU
* Uses contrib turn off for TF 2.0.
* Fix wide_deep and add keras_common_tests.
* use context to get num_gpus.
* Switch to tf.keras.metrics
-
- 28 Jun, 2019 1 commit
-
-
nnigania authored
* borrowing a tf1.x optimization which converts gradients from sparse to dense for better perf
* cleanup after code review
-
- 24 Jun, 2019 1 commit
-
-
nnigania authored
-
- 21 Jun, 2019 1 commit
-
-
Toby Boyd authored
* XLA FP32 and first test
* More XLA benchmarks FP32.
* Add eager to NCF and refactor resnet.
* fix v2_0 calls and more flag refactor.
* Remove extra flag args.
* 90 epoch default
* add return
* remove xla not used by estimator.
* Remove duplicate run_eagerly.
* fix flag defaults.
* Remove fp16_implementation flag option.
* Remove stop early on mlperf test.
* remove unneeded args.
* load flags from keras mains.
-
- 18 Jun, 2019 1 commit
-
-
nnigania authored
* adding a new perf test for ncf, and changing some names
* Added change to make ncf use the data from the gcp bucket, and removed the need to re-download data >1day old. Reorganized the perf-zero tests
-
- 13 Jun, 2019 8 commits
-
-
guptapriya authored
-
guptapriya authored
-
guptapriya authored
-
guptapriya authored
-
guptapriya authored
-
guptapriya authored
-
guptapriya authored
-
guptapriya authored
-
- 05 Jun, 2019 5 commits
-
-
guptapriya authored
-
guptapriya authored
-
guptapriya authored
-
guptapriya authored
-
guptapriya authored
-
- 03 Jun, 2019 9 commits
-
-
guptapriya authored
* Add CTL benchmark
* Divide train loss by number of train steps
* increase num epochs to 10
* add benchmark for early stopping with CTL
* remove whitespace
-
guptapriya authored
-
guptapriya authored
-
guptapriya authored
-
guptapriya authored
-
guptapriya authored
-
guptapriya authored
-
guptapriya authored
-
guptapriya authored
-
- 31 May, 2019 2 commits
-
-
Haoyu Zhang authored
-
Haoyu Zhang authored
* Fix various lint errors
* Fix logging format
-
- 29 May, 2019 1 commit
-
-
Bruce Fontaine authored
* Add flag to use custom training loop for keras NCF model.
* Add error check to NCF model for custom training loop + tf1.0.
-
- 28 May, 2019 3 commits
-
-
Bruce Fontaine authored
* Add a custom training loop for NCF model with TF2.0
* Fix long line in ncf_keras_main.py
* Remove dataset repeat when using custom training loop.
-
guptapriya authored
-
guptapriya authored
-
- 24 May, 2019 2 commits
-
-
Priya Gupta authored
Add early stopping logic to ncf keras when desired threshold is met. Also change the default batch size to match the tuned hyperparams
-
Tian Lin authored
* Merged commit includes the following changes:
  249776315 by tianlin<tianlin@google.com>: Internal change
  249763206 by tianlin<tianlin@google.com>: For TF 2.0 (related to Beam Search), expand cond dims in tf.where(cond, x, y) to make all parameters broadcastable.
  --
  249392724 by hongkuny<hongkuny@google.com>: Internal change
  PiperOrigin-RevId: 249776315
* Merged commit includes the following changes:
  249823043 by tianlin<tianlin@google.com>: Bring back v2 test for predict and eval.
  --
  PiperOrigin-RevId: 249823043
-
- 23 May, 2019 2 commits
-
-
guptapriya authored
Adding validation every epoch allows us to view the progress during training instead of having to wait until the last eval. Mostly useful for manual runs.
-
guptapriya authored
Current batch size 160000 does not converge to the desired HR. So we decrease to 99k which is known to converge. Tested locally and got to 63.5 at epoch 7. Also decreasing number of epochs as I don't see any improvement after epoch 7-8.
-
- 15 May, 2019 1 commit
-
-
Igor authored
* Set the --clone_model_in_keras_dist_strat to None. Remove the separate no_cloning benchmarks and add a couple of cloning ones. Fixes the learning rate schedule to cache its ops per graph.
-
- 08 May, 2019 1 commit
-
-
Toby Boyd authored
-
- 29 Apr, 2019 1 commit
-
-
Igor authored
Replace per_device with per_replica and PerDevice with PerReplica, because the PerDevice concept was renamed and doesn't exist anymore. (#6693)
-