- 23 Jun, 2021 1 commit
Reed Wanderman-Milne authored
Before, get_distribution_strategy() returned None when no strategy was requested. But almost every caller assumes an actual strategy is returned and crashes when None comes back. Returning the default strategy fixes these crashes and is equivalent to using no strategy, since the default strategy is always in effect when no other strategy is in use.
PiperOrigin-RevId: 380951055
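A minimal sketch of the pattern this change describes; the helper name matches the log, but the body below is illustrative rather than the repo's actual implementation:

```python
import tensorflow as tf

def get_distribution_strategy(distribution_strategy="default"):
    """Return a tf.distribute.Strategy; never None (sketch)."""
    if distribution_strategy in ("default", "off"):
        # The default strategy is always in effect when no other strategy
        # has been entered, so callers can use it unconditionally instead
        # of crashing on None.
        return tf.distribute.get_strategy()
    if distribution_strategy == "mirrored":
        return tf.distribute.MirroredStrategy()
    raise ValueError(
        f"Unrecognized distribution strategy: {distribution_strategy!r}")
```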
- 08 Mar, 2021 1 commit
Yeqing Li authored
PiperOrigin-RevId: 361587168
- 25 Feb, 2021 1 commit
Reed Wanderman-Milne authored
Also improve the error message when distribution=off is specified without properly quoting "off": unquoted, YAML-style parsing turns off into the boolean False rather than the string "off".
PiperOrigin-RevId: 359395329
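For context, a quick illustration of the quoting pitfall, assuming the override string is parsed as YAML (as the error message implies):

```python
import yaml

# YAML 1.1 resolves an unquoted `off` to a boolean, not a string.
print(yaml.safe_load("distribution: off"))    # {'distribution': False}
print(yaml.safe_load('distribution: "off"'))  # {'distribution': 'off'}
```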
- 05 Jan, 2021 1 commit
Hongkun Yu authored
PiperOrigin-RevId: 350170448
- 29 Dec, 2020 1 commit
Hongkun Yu authored
PiperOrigin-RevId: 349347368
- 17 Nov, 2020 1 commit
Hongkun Yu authored
PiperOrigin-RevId: 342770296
- 01 Oct, 2020 1 commit
Rick Chao authored
PSv2: Replace existing `tf.distribute.experimental.ParameterServerStrategy` usage with `tf.compat.v1.distribute.experimental.ParameterServerStrategy` to prepare for the upcoming TF2 ParameterServerStrategy API release. For users of `tf.distribute.experimental.ParameterServerStrategy`, practically the only difference from this endpoint switch is that usage monitoring moves from V2 to V1; the strategy is not supported in V2 and should be tracked as V1 anyway.
PiperOrigin-RevId: 334847114
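The switch amounts to a one-line endpoint change; a sketch:

```python
import tensorflow as tf

# Before: the V2 experimental endpoint, about to be repurposed for the
# new TF2 ParameterServerStrategy API.
# strategy = tf.distribute.experimental.ParameterServerStrategy()

# After: the explicit V1 endpoint. Behavior is unchanged; usage is now
# monitored as V1 rather than V2.
strategy = tf.compat.v1.distribute.experimental.ParameterServerStrategy()
```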
- 17 Sep, 2020 1 commit
Hongkun Yu authored
PiperOrigin-RevId: 332314917
- 13 Sep, 2020 1 commit
Hongkun Yu authored
PiperOrigin-RevId: 331359058
- 12 Aug, 2020 1 commit
Hongkun Yu authored
PiperOrigin-RevId: 326286926
- 10 Apr, 2020 1 commit
Hongkun Yu authored
PiperOrigin-RevId: 305897677
- 30 Mar, 2020 1 commit
Hongkun Yu authored
PiperOrigin-RevId: 303767122
- 17 Mar, 2020 1 commit
ayushmankumar7 authored
- 13 Feb, 2020 1 commit
Hongkun Yu authored
PiperOrigin-RevId: 294997928
- 27 Jan, 2020 1 commit
Yanhui Liang authored
PiperOrigin-RevId: 291810091
- 15 Dec, 2019 1 commit
Hongkun Yu authored
PiperOrigin-RevId: 285618209
- 14 Dec, 2019 2 commits
Hongkun Yu authored
PiperOrigin-RevId: 285533511
A. Unique TensorFlower authored
PiperOrigin-RevId: 285503670
- 27 Nov, 2019 1 commit
Hongkun Yu authored
PiperOrigin-RevId: 282669615
- 24 Sep, 2019 1 commit
Bruce Fontaine authored
PiperOrigin-RevId: 270926016
- 19 Aug, 2019 1 commit
Ayush Dubey authored
PiperOrigin-RevId: 264244022
- 16 Aug, 2019 2 commits
Hongkun Yu authored
PiperOrigin-RevId: 263863438
Priya Gupta authored
PiperOrigin-RevId: 263854996
- 12 Aug, 2019 1 commit
Hongjun Choi authored
262988559 by A. Unique TensorFlower <gardener@tensorflow.org>:
  Enable NCF TF 2.0 model to run on TPUStrategy.
262971756 by A. Unique TensorFlower <gardener@tensorflow.org>:
  Internal change
262967691 by hongkuny <hongkuny@google.com>:
  Internal
PiperOrigin-RevId: 262988559
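A minimal sketch of running a Keras model under TPUStrategy; the TPU address and the stand-in model are illustrative, not the NCF code itself:

```python
import tensorflow as tf

tpu_address = "grpc://10.0.0.2:8470"  # illustrative; depends on deployment

resolver = tf.distribute.cluster_resolver.TPUClusterResolver(tpu=tpu_address)
tf.config.experimental_connect_to_cluster(resolver)
tf.tpu.experimental.initialize_tpu_system(resolver)
strategy = tf.distribute.experimental.TPUStrategy(resolver)

with strategy.scope():
    # Build the Keras model here so its variables land on the TPU.
    model = tf.keras.Sequential([tf.keras.layers.Dense(1)])  # stand-in model
```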
- 02 Jul, 2019 1 commit
Yuefeng Zhou authored
when there are multiple workers.
- 29 Apr, 2019 1 commit
Igor authored
Replace per_device with per_replica and PerDevice with PerReplica, because the PerDevice concept was renamed and doesn't exist anymore. (#6693)
- 26 Apr, 2019 1 commit
Ayush Dubey authored
* Add num_packs flag for MirroredStrategy's cross device ops.
* fix parens
* Fix lint errors and make all_reduce_alg more robust.
* Set default num_packs to 1
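One plausible way such a flag maps onto the TF API (illustrative; the repo's wiring may differ): num_packs controls how many packs gradients are grouped into before the all-reduce.

```python
import tensorflow as tf

# Pack all gradients into a single group before the NCCL all-reduce.
strategy = tf.distribute.MirroredStrategy(
    cross_device_ops=tf.distribute.NcclAllReduce(num_packs=1))
```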
- 25 Apr, 2019 1 commit
Ayush Dubey authored
* Remove contrib AllReduceCrossDeviceOps and update all_reduce_alg options with MirroredStrategy.
* cleanup
- 24 Apr, 2019 1 commit
Yuefeng Zhou authored
- 08 Apr, 2019 1 commit
Shining Sun authored
* add ds support for ncf
* remove comments for in_top_k
* avoid expanding the input layers
* resolve comments and fix lint
* Added some comments in code and fix lint
* fix lint
* add some documentation
* add tensorflow imports
- 01 Apr, 2019 1 commit
Haoyu Zhang authored
- 19 Mar, 2019 1 commit
Soroush Radpour authored
- 07 Mar, 2019 1 commit
Ayush Dubey authored
* s/CollectiveAllReduceStrategy/MultiWorkerMirroredStrategy
* More s/contrib.distribute/distribute.experimental
* Collective communication options in MultiWorkerMirroredStrategy.
* Minor fixes
* No checkpointing if multi worker.
* turn off checkpointing
* fix lint
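A sketch of the renamed strategy with an explicit collective communication option, using the TF 2.x experimental endpoints:

```python
import tensorflow as tf

strategy = tf.distribute.experimental.MultiWorkerMirroredStrategy(
    communication=tf.distribute.experimental.CollectiveCommunication.NCCL)
```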
- 02 Mar, 2019 1 commit
Taylor Robie authored
* fix resnet breakage and add keras end-to-end tests
* delint
* address PR comments
- 01 Mar, 2019 1 commit
Shining Sun authored
* tmp commit
* tmp commit
* first attempt (without eval)
* Bug fixes
* bug fixes
* training done
* Loss NAN, no eval
* Loss weight problem solved
* resolve the NAN loss problem
* Problem solved. Clean up needed
* Added a todo
* Remove debug prints
* Extract get_optimizer to ncf_common
* Move metrics computation back to neumf; use DS.scope api
* Extract DS.scope code to utils
* lint fixes
* Move obtaining DS above producer.start to avoid race condition
* move pt 1
* move pt 2
* Update the run script
* Wrap keras_model related code into functions
* Update the doc for softmax_logitfy and change the method name
* Resolve PR comments
* working version with: eager, DS, batch and no masks
* Remove git conflict indicator
* move reshape to neumf_model
* working version, not converge
* converged
* fix a test
* more lint fix
* more lint fix
* more lint fixes
* more lint fix
* Removed unused imports
* fix test
* dummy commit for kicking of checks
* fix lint issue
* dummy input to kick off checks
* dummy input to kick off checks
* add collective to dist strat
* addressed review comments
* add a doc string
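The recurring "DS.scope" item in the list above refers to building the model and optimizer inside the strategy's scope; a minimal sketch, with stand-ins for the NCF model and the get_optimizer helper the log says was extracted to ncf_common:

```python
import tensorflow as tf

strategy = tf.distribute.MirroredStrategy()
with strategy.scope():
    # Variables created here (model weights, optimizer slots) are
    # mirrored across the strategy's replicas.
    model = tf.keras.Sequential([tf.keras.layers.Dense(1)])  # stand-in for NCF
    optimizer = tf.keras.optimizers.Adam()  # stand-in for get_optimizer(params)
    model.compile(optimizer=optimizer, loss="binary_crossentropy")
```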
- 28 Feb, 2019 1 commit
Ayush Dubey authored
* s/CollectiveAllReduceStrategy/MultiWorkerMirroredStrategy
* More s/contrib.distribute/distribute.experimental
- 21 Feb, 2019 1 commit
Ayush Dubey authored
* Update official resnet for multi worker training with distribution strategies.
* Fixes for multi worker training.
* Fix call to `get_distribution_strategy`.
* Undo test change.
* Fix spacing.
* Move cluster configuration to distribution_utils.
* Move train_and_evaluate out of loop. Also, update docstrings for multi-worker flags and add use_train_and_evaluate flag.
* Update distribution_strategy flag to match exported name for collective strategy.
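Multi-worker cluster configuration in TensorFlow is conventionally supplied through the TF_CONFIG environment variable; a sketch with illustrative hosts and ports:

```python
import json
import os

os.environ["TF_CONFIG"] = json.dumps({
    "cluster": {"worker": ["worker0.example.com:12345",
                           "worker1.example.com:12345"]},
    "task": {"type": "worker", "index": 0},  # this process's role
})
```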
- 14 Feb, 2019 1 commit
Toby Boyd authored
* One device from contrib to core.
* remove test code.
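After the move, OneDeviceStrategy is available at the core endpoint:

```python
import tensorflow as tf

# Places all variables and computation on a single named device.
strategy = tf.distribute.OneDeviceStrategy(device="/gpu:0")
```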
- 13 Feb, 2019 1 commit
Yuefeng Zhou authored
* Add a flag to specify distribution strategies.
* Fix a small error.
* Address comments.
* Address comments.
* Fix typos.
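A hypothetical sketch of such a flag using absl.flags; the flag name matches the log, but the definition here is illustrative, not the repo's code:

```python
from absl import flags

flags.DEFINE_string(
    "distribution_strategy", "default",
    "Which tf.distribute strategy to use, e.g. 'default', 'one_device', "
    "or 'mirrored'.")
```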