Commits · 830a17eca3550c3f4d9f4878d90405b288c10a42 · ModelZoo / ResNet50_tensorflow

21 Jul, 2019 1 commit
- Add a simple signal-based Python callstack sampler for debugging · 830a17ec
  Zongwei Zhou authored Jul 19, 2019
  
  830a17ec
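
A signal-based sampler of the kind this commit title describes can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: the class name `CallstackSampler`, the helper `busy_work`, and the 10 ms default interval are all assumptions. It relies on `signal.setitimer`, so it only works on Unix and only when installed from the main thread.

```python
import collections
import signal
import time


class CallstackSampler:
    """Counts the Python call stacks observed at fixed wall-clock intervals.

    Hypothetical sketch: a timer delivers SIGALRM periodically, and the
    handler records the stack the main thread was executing at that moment.
    """

    def __init__(self, interval_s=0.01):
        self.interval_s = interval_s
        self.counts = collections.Counter()
        self._old_handler = None

    def _on_signal(self, signum, frame):
        # Walk from the interrupted frame back to the outermost caller.
        stack = []
        while frame is not None:
            stack.append("{}:{}".format(frame.f_code.co_name, frame.f_lineno))
            frame = frame.f_back
        stack.reverse()
        self.counts[tuple(stack)] += 1

    def __enter__(self):
        self._old_handler = signal.signal(signal.SIGALRM, self._on_signal)
        # First expiry after interval_s, then repeat every interval_s.
        signal.setitimer(signal.ITIMER_REAL, self.interval_s, self.interval_s)
        return self

    def __exit__(self, *exc_info):
        # Disarm the timer before restoring the previous handler.
        signal.setitimer(signal.ITIMER_REAL, 0)
        signal.signal(signal.SIGALRM, self._old_handler)
        return False


def busy_work(duration_s=0.2):
    """Spins for roughly duration_s seconds so the sampler has something to see."""
    deadline = time.monotonic() + duration_s
    total = 0
    while time.monotonic() < deadline:
        total += 1
    return total
```

Wrapping a slow section in `with CallstackSampler() as sampler:` and then inspecting `sampler.counts.most_common()` shows which stacks were hit most often, which is usually enough to spot where a hung or slow job is spending its time.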
20 Jul, 2019 1 commit
- improved v2 check. · 49b90e86
  Toby Boyd authored Jul 19, 2019
  
  49b90e86
19 Jul, 2019 2 commits

Revert "Change how TF 2 is checked" (#7260) · 2569fa9a
Toby Boyd authored Jul 19, 2019
```
This reverts commit 712f473e.
```
2569fa9a

Change how TF 2 is checked · 712f473e

guptapriya authored Jul 18, 2019

The current approach checks for the presence of contrib. Sometimes this is not sufficient (e.g., when testing TF 1 + enable_v2_behavior=True, which is what internal tests currently do).

712f473e

18 Jul, 2019 1 commit

Refactor and add benchmarks as well as accuracy tests for GPU and CPU (#7248) · e0a2b8c3

Toby Boyd authored Jul 18, 2019

* Added benchmarks and common flags.

* Add cpu tests.

* Add tracking epoch times.

* fix transformer.

* Add examples_per_second.

* fix pylint

e0a2b8c3
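
The `examples_per_second` figure mentioned in this commit is a throughput number. One plausible way to compute it, shown here as an assumption rather than the repository's actual code, is to divide the number of examples processed in a window of steps by the elapsed wall-clock time. The function and parameter names below are hypothetical.

```python
def examples_per_second(batch_size, num_steps, elapsed_seconds):
    """Average throughput over a window of `num_steps` training steps.

    Hypothetical helper: batch_size is the global batch size, and
    elapsed_seconds is the wall-clock time the window took.
    """
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return batch_size * num_steps / elapsed_seconds
```

For instance, 100 steps at a batch size of 128 completed in 20 seconds works out to 640 examples per second.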

11 Jul, 2019 1 commit

Move Keras Hook to use global step to resolve issues across epochs. (#7186) · f4b02d15

Toby Boyd authored Jul 10, 2019

* Move to global_step.

* Hook to use global_step.

* fix comment start step 1 not step 0.

* remove hack used for testing.

* Add docstring.

f4b02d15

03 Jul, 2019 1 commit

Unit tests pass TF 2.0 GPU and CPU locally. (#7101) · 49097655

Toby Boyd authored Jul 03, 2019

* Fix unit tests failures.

* 96% of TF 2.0 tests on GPU are passing.

* Currently all passing GPU and CPU TF 2.0

* Address code comments.

* use tf 2.0 cast.

* Comment about working on TF 2.0 CPU

* Uses contrib turn off for TF 2.0.

* Fix wide_deep and add keras_common_tests.

* use context to get num_gpus.

* Switch to tf.keras.metrics

49097655

02 Jul, 2019 1 commit
- Allow distribution_utils.py to work with PSStrategy or none strategy (#7135) · 680eb35c
  Yuefeng Zhou authored Jul 02, 2019
```
when there are multiple workers.
```
  680eb35c
19 Jun, 2019 1 commit

Add XLA to transformer (#7048) · 269581dc

Toby Boyd authored Jun 19, 2019



* set default steps to 300K.

* Log flags to perfzero.

* Add XLA support to transformer

- Moved config logic to keras_utils
- Added enable_xla flag to _performance flags
- Did not refactor enable_xla flag from keras resnet due to
  reliance on calling FLAGs in estimator keras and that is
  a needed refactor for another time.

* fix g3 lint complaint.

* Refactor set config into keras_utils.

* Move flags out of main.

* pipe through enable_xla

* Update official/transformer/v2/misc.py
Co-Authored-By: Reed <reedwm@google.com>

269581dc

24 May, 2019 1 commit
- Moved common keras code to utils. (#6859) · 3254cabb
  Toby Boyd authored May 24, 2019
  
  3254cabb
29 Apr, 2019 1 commit

Replace per_device with per_replica and PerDevice with PerReplica, because the... · b00783d7

Igor authored Apr 29, 2019

Replace per_device with per_replica and PerDevice with PerReplica, because the PerDevice concept was renamed and doesn't exist anymore. (#6693)

* Replace per_device with per_replica and PerDevice with PerReplica, because the PerDevice concept was renamed and doesn't exist anymore.

b00783d7

26 Apr, 2019 1 commit

Add num_packs flag for MirroredStrategy's cross device ops. (#6676) · 4a1fba0b

Ayush Dubey authored Apr 26, 2019

* Add num_packs flag for MirroredStrategy's cross device ops.

* fix parens

* Fix lint errors and make all_reduce_alg more robust.

* Set default num_packs to 1

4a1fba0b

25 Apr, 2019 1 commit
- Remove contrib cross device ops and update all_reduce_alg options. (#6673) · ece99414
  Ayush Dubey authored Apr 25, 2019
```
* Remove contrib AllReduceCrossDeviceOps and update all_reduce_alg options with MirroredStrategy.

* cleanup
```
  ece99414
24 Apr, 2019 1 commit
- Update distribution_utils.py (#6615) · 98672351
  Yuefeng Zhou authored Apr 24, 2019
  
  98672351
11 Apr, 2019 1 commit

Make BatchTimestamp object printable. (#6557) · 80dde852

rxsang authored Apr 10, 2019

* Make BatchTimestamp object printable.

* Removing trailing whitespace.

* Make BatchTimestamp repr a string.

80dde852
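
Making an object printable in Python usually means giving it a `__repr__` that returns a string, which is what the last bullet of this commit points at. A minimal sketch follows; the attribute names `batch_index` and `timestamp` and the exact output format are assumptions, not taken from the commit.

```python
class BatchTimestamp:
    """Pairs a batch index with the time at which it was recorded."""

    def __init__(self, batch_index, timestamp):
        self.batch_index = batch_index
        self.timestamp = timestamp

    def __repr__(self):
        # __repr__ must return a str; returning any other type makes
        # repr() and print() raise a TypeError.
        return "BatchTimestamp<batch_index: {}, timestamp: {}>".format(
            self.batch_index, self.timestamp)
```

With this in place, logging a list of `BatchTimestamp` objects shows their contents instead of the default `<... object at 0x...>` form.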

08 Apr, 2019 1 commit

Add DS support for NCF keras (#6447) · 1255d5b9

Shining Sun authored Apr 08, 2019

* add ds support for ncf

* remove comments for in_top_k

* avoid expanding the input layers

* resolve comments and fix lint

* Added some comments in code and fix lint

* fix lint

* add some documentation

* add tensorflow imports

1255d5b9

01 Apr, 2019 1 commit
- Add synthetic data monkey patch to OneDeviceStrategy as well (#6505) · d9823dae
  Haoyu Zhang authored Apr 01, 2019
  
  d9823dae
29 Mar, 2019 1 commit
- fix a typo in doc string (#6475) · 5775220a
  Shining Sun authored Mar 28, 2019
  
  5775220a
28 Mar, 2019 1 commit

Added benchmark test and convergence test for the NCF model (#6318) · 4c11b84b

Shining Sun authored Mar 28, 2019

* initial commit

* bug fix

* Move build_stats from common to keras main, because it is only applicable in keras

* remove trailing blank line

* add test for synth data

* add kwargs to init

* add kwargs to function invocation

* correctly pass kwargs

* debug

* debug

* debug

* fix super init

* bug fix

* fix local_flags

* fix import

* bug fix

* fix log_steps flag

* bug fix

* bug fix: add missing return value

* resolve double-defined flags

* lint fix

* move log_steps flag to benchmark flag

* fix lint

* lint fix

* lint fix

* try flag core default values

* bug fix

* bug fix

* bug fix

* debug

* debug

* remove debug prints

* rename benchmark methods

* flag bug fix for synth benchmark

4c11b84b

19 Mar, 2019 1 commit
- Add the option to run Keras resnet model on multiple workers. (#6368) · 3024bde6
  Soroush Radpour authored Mar 19, 2019
  
  3024bde6
07 Mar, 2019 1 commit

Add command line option for multi worker collective implementations, disable checkpointing. (#6317) · 05a79f5a

Ayush Dubey authored Mar 07, 2019

* s/CollectiveAllReduceStrategy/MultiWorkerMirroredStrategy

* More s/contrib.distribute/distribute.experimental

* Collective communication options in MultiWorkerMirroredStrategy.

* Minor fixes

* No checkpointing if multi worker.

* turn off checkpointing

* fix lint

05a79f5a

02 Mar, 2019 1 commit
- fix resnet breakage and add keras end-to-end tests (#6295) · 8367cf6d
  Taylor Robie authored Mar 02, 2019
```
* fix resnet breakage and add keras end-to-end tests

* delint

* address PR comments
```
  8367cf6d
01 Mar, 2019 1 commit

Keras-fy NCF Model (#6092) · 048e5bff

Shining Sun authored Mar 01, 2019

* tmp commit

* tmp commit

* first attempt (without eval)

* Bug fixes

* bug fixes

* training done

* Loss NAN, no eval

* Loss weight problem solved

* resolve the NAN loss problem

* Problem solved. Clean up needed

* Added a todo

* Remove debug prints

* Extract get_optimizer to ncf_common

* Move metrics computation back to neumf; use DS.scope api

* Extract DS.scope code to utils

* lint fixes

* Move obtaining DS above producer.start to avoid race condition

* move pt 1

* move pt 2

* Update the run script

* Wrap keras_model related code into functions

* Update the doc for softmax_logitfy and change the method name

* Resolve PR comments

* working version with: eager, DS, batch and no masks

* Remove git conflict indicator

* move reshape to neumf_model

* working version, not converge

* converged

* fix a test

* more lint fix

* more lint fix

* more lint fixes

* more lint fix

* Removed unused imports

* fix test

* dummy commit for kicking off checks

* fix lint issue

* dummy input to kick off checks

* dummy input to kick off checks

* add collective to dist strat

* addressed review comments

* add a doc string

048e5bff

28 Feb, 2019 2 commits
- Change `CollectiveAllReduceStrategy` to `MultiWorkerMirroredStrategy`. (#6282) · d793ea82
  Ayush Dubey authored Feb 28, 2019
```
* s/CollectiveAllReduceStrategy/MultiWorkerMirroredStrategy

* More s/contrib.distribute/distribute.experimental
```
  d793ea82
- Updating stale DistributionStrategy test. (#6281) · 4b566d4e
  Tayo Oguntebi authored Feb 28, 2019
  
  4b566d4e
21 Feb, 2019 1 commit

Multi-worker support for Resnet. (#6206) · f2e90945

Ayush Dubey authored Feb 21, 2019

* Update official resnet for multi worker training with distribution strategies.

* Fixes for multi worker training.

* Fix call to `get_distribution_strategy`.

* Undo test change.

* Fix spacing.

* Move cluster configuration to distribution_utils.

* Move train_and_evaluate out of loop. Also, update docstrings for multi-worker flags and add use_train_and_evaluate flag.

* Update distribution_strategy flag to match exported name for collective strategy.

f2e90945

14 Feb, 2019 1 commit
- One device strat (#6196) · b66ef95e
  Toby Boyd authored Feb 13, 2019
```
* One device from contrib to core.

* remove test code.
```
  b66ef95e
13 Feb, 2019 1 commit

Add a flag to specify distribution strategies. (#6185) · 79b57a3f

Yuefeng Zhou authored Feb 12, 2019

* Add a flag to specify distribution strategies.

* Fix a small error.

* Address comments.

* Address comments.

* Fix typos.

79b57a3f

12 Feb, 2019 1 commit

V2 contrib tweaks (#6184) · a1ee97e6

Toby Boyd authored Feb 11, 2019

* Remove contrib thread pool.

* Remove commented out contrib import.

* Fix lint issues.

* move tf.data.options higher. Tweak line breaks.

* do not monkey patch on or off if dist_strat is off

* Do not monkey patch if no_dist_strat.

* Fix file permissions.

* fix file permissions.

* Revert change to main.  Add hasattr(tf, 'contrib') to utils

* compat.v1.logging

* tf.compat.v1.get_local_variables.

a1ee97e6

11 Feb, 2019 1 commit

Remove contrib thread pool. (#6175) · b6c0c7f9

Toby Boyd authored Feb 11, 2019

* Remove contrib thread pool.

* Remove commented out contrib import.

* Fix lint issues.

* move tf.data.options higher. Tweak line breaks.

b6c0c7f9

09 Feb, 2019 1 commit

Add pure synthetic data to keras resnet model. (#6174) · 05383c7b

Yuefeng Zhou authored Feb 08, 2019

* Add pure synthetic data to keras resnet mode.

* Add imports.

* Address comments.

* update comment

* Undo set up synthetic data for real data path.

* update comment

* Address comment

* Remove trailing whitespaces.

* s/make_data_set_iterator/make_dataset_iterator/

05383c7b

08 Feb, 2019 1 commit
- Revert "Revert "tf_upgrade_v2 on resnet and utils folders. (#6154)" (#6162)" (#6167) · b2c9e3f5
  Goldie Gadde authored Feb 08, 2019
```
This reverts commit 57e07520.
```
  b2c9e3f5
06 Feb, 2019 1 commit
- Revert "tf_upgrade_v2 on resnet and utils folders. (#6154)" (#6162) · 57e07520
  Goldie Gadde authored Feb 06, 2019
```
This reverts commit d6b2b83c.
```
  57e07520
05 Feb, 2019 1 commit

tf_upgrade_v2 on resnet and utils folders. (#6154) · d6b2b83c

Goldie Gadde authored Feb 05, 2019

* Add resnet56 short tests. (#6101)

* Add resnet56 short tests.
- created base benchmark module
- renamed accuracy test class to contain the word Accuracy
which will result in a need to update all the jobs
and a loss of history but is worth it.
- short tests are mostly copied from shining with oss refactor

* Address feedback.

* Move flag_methods to init
- Address setting default flags repeatedly.

* Rename accuracy tests.

* Lint errors resolved.

* fix model_dir set to flags.data_dir.

* fixed not fully pulling out flag_methods.

* Use core mirrored strategy in official models (#6126)

* Imagenet short tests (#6132)

* Add short imagenet tests (taken from seemuch)
- also rename to match go forward naming

* fix method name

* Update doc strings.

* Fix gpu number.

* points default data_dir to child folder. (#6131)

Failed test is python2 and was a kokoro failure

* Imagenet short tests (#6136)

* Add short imagenet tests (taken from seemuch)
- also rename to match go forward naming

* fix method name

* Update doc strings.

* Fix gpu number.

* Add fill_objects

* fixed calling wrong class in super.

* fix lint issue.

* Flag (#6121)

* Fix the turn_off_ds flag problem

* add param names to all args

* Export benchmark stats using tf.test.Benchmark.report_benchmark() (#6103)

* Export benchmark stats using tf.test.Benchmark.report_benchmark()

* Fix python style using pyformat

* Typos. (#6120)

* log verbosity=2 logs every epoch no progress bars (#6142)

* tf_upgrade_v2 on resnet and utils folder.

* tf_upgrade_v2 on resnet and utils folder.

d6b2b83c

01 Feb, 2019 1 commit
- Use core mirrored strategy in official models (#6126) · a66d4713
  guptapriya authored Jan 31, 2019
  
  a66d4713
27 Dec, 2018 1 commit
- Fixed lint and flag issues · 03c35ec6
  Shining Sun authored Dec 27, 2018
  
  03c35ec6
24 Dec, 2018 1 commit
- fix lint errors. · 122bb012
  Toby Boyd authored Dec 23, 2018
  
  122bb012
21 Dec, 2018 1 commit
- bug fixes · c923a420
  Shining Sun authored Dec 21, 2018
  
  c923a420
20 Dec, 2018 2 commits
- bug fixes and clean ups · 6f881f77
  Shining Sun authored Dec 20, 2018
  
  6f881f77
- Include the distribution_utils file · b1b4c805
  Shining Sun authored Dec 19, 2018
  
  b1b4c805