- 23 Jun, 2021 1 commit
Reed Wanderman-Milne authored
Before, get_distribution_strategy() returned None when no strategy was requested. But almost every caller assumes an actual strategy is returned and crashes when None comes back. Returning the default strategy fixes these crashes and is equivalent to using no strategy, since the default strategy is always in effect when no other strategy is in use.
PiperOrigin-RevId: 380951055
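A minimal sketch of the pattern this change describes; the helper name matches the log, but the body below is illustrative rather than the repo's actual implementation:

```python
import tensorflow as tf

def get_distribution_strategy(distribution_strategy="default"):
    """Return a tf.distribute.Strategy; never None (sketch)."""
    if distribution_strategy in ("default", "off"):
        # The default strategy is always in effect when no other strategy
        # has been entered, so callers can use it unconditionally instead
        # of crashing on None.
        return tf.distribute.get_strategy()
    if distribution_strategy == "mirrored":
        return tf.distribute.MirroredStrategy()
    raise ValueError(
        f"Unrecognized distribution strategy: {distribution_strategy!r}")
```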
- 08 Mar, 2021 1 commit
Yeqing Li authored
PiperOrigin-RevId: 361587168
- 25 Feb, 2021 1 commit
Reed Wanderman-Milne authored
Also improve the error message when distribution=off is specified without properly quoting "off": unquoted, YAML-style parsing turns off into the boolean False rather than the string "off".
PiperOrigin-RevId: 359395329
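For context, a quick illustration of the quoting pitfall, assuming the override string is parsed as YAML (as the error message implies):

```python
import yaml

# YAML 1.1 resolves an unquoted `off` to a boolean, not a string.
print(yaml.safe_load("distribution: off"))    # {'distribution': False}
print(yaml.safe_load('distribution: "off"'))  # {'distribution': 'off'}
```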
- 05 Jan, 2021 1 commit
Hongkun Yu authored
PiperOrigin-RevId: 350170448
- 29 Dec, 2020 1 commit
Hongkun Yu authored
PiperOrigin-RevId: 349347368
- 17 Nov, 2020 1 commit
Hongkun Yu authored
PiperOrigin-RevId: 342770296
- 01 Oct, 2020 1 commit
Rick Chao authored
PSv2: Replace existing `tf.distribute.experimental.ParameterServerStrategy` usage with `tf.compat.v1.distribute.experimental.ParameterServerStrategy` to prepare for the upcoming TF2 ParameterServerStrategy API release. For users of `tf.distribute.experimental.ParameterServerStrategy`, practically the only difference from this endpoint switch is that usage monitoring moves from V2 to V1; the strategy is not supported in V2 and should be tracked as V1 anyway.
PiperOrigin-RevId: 334847114
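The switch amounts to a one-line endpoint change; a sketch:

```python
import tensorflow as tf

# Before: the V2 experimental endpoint, about to be repurposed for the
# new TF2 ParameterServerStrategy API.
# strategy = tf.distribute.experimental.ParameterServerStrategy()

# After: the explicit V1 endpoint. Behavior is unchanged; usage is now
# monitored as V1 rather than V2.
strategy = tf.compat.v1.distribute.experimental.ParameterServerStrategy()
```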
- 17 Sep, 2020 1 commit
Hongkun Yu authored
PiperOrigin-RevId: 332314917
- 13 Sep, 2020 1 commit
Hongkun Yu authored
PiperOrigin-RevId: 331359058
- 12 Aug, 2020 1 commit
Hongkun Yu authored
PiperOrigin-RevId: 326286926
- 10 Apr, 2020 1 commit
Hongkun Yu authored
PiperOrigin-RevId: 305897677
- 30 Mar, 2020 1 commit
Hongkun Yu authored
PiperOrigin-RevId: 303767122
- 17 Mar, 2020 1 commit
ayushmankumar7 authored
- 13 Feb, 2020 1 commit
Hongkun Yu authored
PiperOrigin-RevId: 294997928
- 27 Jan, 2020 1 commit
Yanhui Liang authored
PiperOrigin-RevId: 291810091
- 15 Dec, 2019 1 commit
Hongkun Yu authored
PiperOrigin-RevId: 285618209
- 14 Dec, 2019 2 commits
Hongkun Yu authored
PiperOrigin-RevId: 285533511
A. Unique TensorFlower authored
PiperOrigin-RevId: 285503670
- 27 Nov, 2019 1 commit
Hongkun Yu authored
PiperOrigin-RevId: 282669615
- 24 Sep, 2019 1 commit
Bruce Fontaine authored
PiperOrigin-RevId: 270926016
- 19 Aug, 2019 1 commit
Ayush Dubey authored
PiperOrigin-RevId: 264244022
- 16 Aug, 2019 2 commits
Hongkun Yu authored
PiperOrigin-RevId: 263863438
Priya Gupta authored
PiperOrigin-RevId: 263854996
- 12 Aug, 2019 1 commit
Hongjun Choi authored
262988559 by A. Unique TensorFlower <gardener@tensorflow.org>:
  Enable NCF TF 2.0 model to run on TPUStrategy.
262971756 by A. Unique TensorFlower <gardener@tensorflow.org>:
  Internal change
262967691 by hongkuny <hongkuny@google.com>:
  Internal
PiperOrigin-RevId: 262988559
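A minimal sketch of running a Keras model under TPUStrategy; the TPU address and the stand-in model are illustrative, not the NCF code itself:

```python
import tensorflow as tf

tpu_address = "grpc://10.0.0.2:8470"  # illustrative; depends on deployment

resolver = tf.distribute.cluster_resolver.TPUClusterResolver(tpu=tpu_address)
tf.config.experimental_connect_to_cluster(resolver)
tf.tpu.experimental.initialize_tpu_system(resolver)
strategy = tf.distribute.experimental.TPUStrategy(resolver)

with strategy.scope():
    # Build the Keras model here so its variables land on the TPU.
    model = tf.keras.Sequential([tf.keras.layers.Dense(1)])  # stand-in model
```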
- 02 Jul, 2019 1 commit
Yuefeng Zhou authored
when there are multiple workers.
- 29 Apr, 2019 1 commit
Igor authored
Replace per_device with per_replica and PerDevice with PerReplica, because the PerDevice concept was renamed and doesn't exist anymore. (#6693)
- 26 Apr, 2019 1 commit
Ayush Dubey authored
* Add num_packs flag for MirroredStrategy's cross device ops.
* fix parens
* Fix lint errors and make all_reduce_alg more robust.
* Set default num_packs to 1
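One plausible way such a flag maps onto the TF API (illustrative; the repo's wiring may differ): num_packs controls how many packs gradients are grouped into before the all-reduce.

```python
import tensorflow as tf

# Pack all gradients into a single group before the NCCL all-reduce.
strategy = tf.distribute.MirroredStrategy(
    cross_device_ops=tf.distribute.NcclAllReduce(num_packs=1))
```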
- 25 Apr, 2019 1 commit
Ayush Dubey authored
* Remove contrib AllReduceCrossDeviceOps and update all_reduce_alg options with MirroredStrategy.
* cleanup
- 24 Apr, 2019 1 commit
Yuefeng Zhou authored
- 08 Apr, 2019 1 commit
Shining Sun authored
* add ds support for ncf
* remove comments for in_top_k
* avoid expanding the input layers
* resolve comments and fix lint
* Added some comments in code and fix lint
* fix lint
* add some documentation
* add tensorflow imports
- 01 Apr, 2019 1 commit
Haoyu Zhang authored
- 19 Mar, 2019 1 commit
Soroush Radpour authored
- 07 Mar, 2019 1 commit
Ayush Dubey authored
* s/CollectiveAllReduceStrategy/MultiWorkerMirroredStrategy
* More s/contrib.distribute/distribute.experimental
* Collective communication options in MultiWorkerMirroredStrategy.
* Minor fixes
* No checkpointing if multi worker.
* turn off checkpointing
* fix lint
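A sketch of the renamed strategy with an explicit collective communication option, using the TF 2.x experimental endpoints:

```python
import tensorflow as tf

strategy = tf.distribute.experimental.MultiWorkerMirroredStrategy(
    communication=tf.distribute.experimental.CollectiveCommunication.NCCL)
```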
- 02 Mar, 2019 1 commit
Taylor Robie authored
* fix resnet breakage and add keras end-to-end tests
* delint
* address PR comments
- 01 Mar, 2019 1 commit
Shining Sun authored
* tmp commit
* tmp commit
* first attempt (without eval)
* Bug fixes
* bug fixes
* training done
* Loss NAN, no eval
* Loss weight problem solved
* resolve the NAN loss problem
* Problem solved. Clean up needed
* Added a todo
* Remove debug prints
* Extract get_optimizer to ncf_common
* Move metrics computation back to neumf; use DS.scope api
* Extract DS.scope code to utils
* lint fixes
* Move obtaining DS above producer.start to avoid race condition
* move pt 1
* move pt 2
* Update the run script
* Wrap keras_model related code into functions
* Update the doc for softmax_logitfy and change the method name
* Resolve PR comments
* working version with: eager, DS, batch and no masks
* Remove git conflict indicator
* move reshape to neumf_model
* working version, not converge
* converged
* fix a test
* more lint fix
* more lint fix
* more lint fixes
* more lint fix
* Removed unused imports
* fix test
* dummy commit for kicking of checks
* fix lint issue
* dummy input to kick off checks
* dummy input to kick off checks
* add collective to dist strat
* addressed review comments
* add a doc string
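The recurring "DS.scope" item in the list above refers to building the model and optimizer inside the strategy's scope; a minimal sketch, with stand-ins for the NCF model and the get_optimizer helper the log says was extracted to ncf_common:

```python
import tensorflow as tf

strategy = tf.distribute.MirroredStrategy()
with strategy.scope():
    # Variables created here (model weights, optimizer slots) are
    # mirrored across the strategy's replicas.
    model = tf.keras.Sequential([tf.keras.layers.Dense(1)])  # stand-in for NCF
    optimizer = tf.keras.optimizers.Adam()  # stand-in for get_optimizer(params)
    model.compile(optimizer=optimizer, loss="binary_crossentropy")
```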
- 28 Feb, 2019 1 commit
Ayush Dubey authored
* s/CollectiveAllReduceStrategy/MultiWorkerMirroredStrategy
* More s/contrib.distribute/distribute.experimental
- 21 Feb, 2019 1 commit
Ayush Dubey authored
* Update official resnet for multi worker training with distribution strategies.
* Fixes for multi worker training.
* Fix call to `get_distribution_strategy`.
* Undo test change.
* Fix spacing.
* Move cluster configuration to distribution_utils.
* Move train_and_evaluate out of loop. Also, update docstrings for multi-worker flags and add use_train_and_evaluate flag.
* Update distribution_strategy flag to match exported name for collective strategy.
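Multi-worker cluster configuration in TensorFlow is conventionally supplied through the TF_CONFIG environment variable; a sketch with illustrative hosts and ports:

```python
import json
import os

os.environ["TF_CONFIG"] = json.dumps({
    "cluster": {"worker": ["worker0.example.com:12345",
                           "worker1.example.com:12345"]},
    "task": {"type": "worker", "index": 0},  # this process's role
})
```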
- 14 Feb, 2019 1 commit
Toby Boyd authored
* One device from contrib to core.
* remove test code.
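After the move, OneDeviceStrategy is available at the core endpoint:

```python
import tensorflow as tf

# Places all variables and computation on a single named device.
strategy = tf.distribute.OneDeviceStrategy(device="/gpu:0")
```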
- 13 Feb, 2019 1 commit
Yuefeng Zhou authored
* Add a flag to specify distribution strategies.
* Fix a small error.
* Address comments.
* Address comments.
* Fix typos.
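A hypothetical sketch of such a flag using absl.flags; the flag name matches the log, but the definition here is illustrative, not the repo's code:

```python
from absl import flags

flags.DEFINE_string(
    "distribution_strategy", "default",
    "Which tf.distribute strategy to use, e.g. 'default', 'one_device', "
    "or 'mirrored'.")
```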