Commits · 87542800b599e3d8715974e9cf178e4ceebbf97a · ModelZoo / ResNet50_tensorflow

01 Aug, 2019 3 commits

Move the official ResNet (estimator version) under `official/r1` (#7355) · 87542800
Haoyu Zhang authored Aug 01, 2019
```
* Restructure resnet estimator code to under official/r1

* Continue moving resnet code...

* Improved README.md
```

Merged commit includes the following changes: (#7354) · dc4c5f1a

Haoyu Zhang authored Aug 01, 2019

261171038  by gjn<gjn@google.com>:

    Remove weight_decay_rate 0 early exit check

    Removing this code path should be fine since this was actually not doing
    what it meant to do. Since weight_decay_rate is actually a tensor, the
    equality check was only looking at the id of the object and comparing to
    0. This should never be true. Evaluating a tensor is also not what we
    want to do at this point of the code. Thus it should be fine to simply
    remove this code.

--
261169862  by haoyuzhang<haoyuzhang@google.com>:

    Internal change

261153520  by haoyuzhang<haoyuzhang@google.com>:

    Internal change

261140302  by hongkuny<hongkuny@google.com>:

    Clean up

--

PiperOrigin-RevId: 261171038
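
The reasoning in change 261171038 above can be illustrated with a minimal sketch. It does not use TensorFlow: `FakeTensor` and `apply_weight_decay` are hypothetical stand-ins, assuming only what the message states, namely that the rate is an object whose `==` compares identity rather than value. Under that assumption the early exit can never fire.

```python
class FakeTensor:
    """Hypothetical stand-in for a graph tensor.

    It holds a value but does not override __eq__, so `==` falls back to
    Python's default identity comparison.
    """

    def __init__(self, value):
        self.value = value


def apply_weight_decay(param, weight_decay_rate):
    """Hypothetical decay step with the kind of early exit the change removed."""
    if weight_decay_rate == 0:  # identity comparison against the int 0
        return param  # never reached when the rate is a FakeTensor
    return param * (1.0 - weight_decay_rate.value)


rate = FakeTensor(0.0)
early_exit_taken = (rate == 0)           # False, even though the value is 0.0
decayed = apply_weight_decay(2.0, rate)  # goes through the full decay path
```

Because the check is always false, dropping it does not change which code path runs, which matches the message's claim that removing it should be safe.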


Remove whitespaces from empty lines (#7353) · 144bc3c2
Haoyu Zhang authored Aug 01, 2019


31 Jul, 2019 1 commit
- Change to experimental_run_tf_function. (#7344) · a552e76a
  Toby Boyd authored Jul 31, 2019
  
30 Jul, 2019 1 commit
- Add accuracy and performance Resnet runs with --force_v2_in_keras_com… (#7333) · ff21bff0
  Igor authored Jul 30, 2019
```
* Add accuracy and performance Resnet runs with --force_v2_in_keras_compile = True.

* Fixed lint
```
25 Jul, 2019 1 commit
- Add Resnet50 CTL benchmark (pure eager w/ distribution strategy) · b59420dd
  Zongwei Zhou authored Jul 25, 2019
  
24 Jul, 2019 2 commits
- Returning an object causes the program to exit with a non-zero code. (#7294) · 9fb1a1b6
  Soroush Radpour authored Jul 24, 2019
  
- fix flags to force_v2_in_keras_compile (#7287) · d09994b2
  Toby Boyd authored Jul 23, 2019
  
23 Jul, 2019 1 commit

Single execution path tests for ResNet50, ResNet56, NCF, and Shakespeare LSTM. (#7276) · 9d8c9aa4

Toby Boyd authored Jul 23, 2019

* Add force_run_distributed tests.

* Added enable_eager

* r/force_run_distributed/force_v2_in_keras_compile

* Adding force_v2 tests and FLAGs.

* Rename method to avoid conflict.

* Add cpu force_v2 tests.

* fix lint, wrap line.

* change to force_v2_in_keras_compile

* Update method name.

* Lower mlperf target to 0.736.


19 Jul, 2019 2 commits

Merged commit includes the following changes: (#7264) · 6f47c378

Igor authored Jul 19, 2019

259030078  by isaprykin<isaprykin@google.com>:

    Clean up the --clone_model_in_keras_dist_strat from Keras Resnet.

    The cloning flag has been removed.  The current rule is that cloning is only done in graph mode.  That resulted in duplicate benchmarks: eager+no-cloning vs eager+cloning.  I removed eager+cloning ones.

--
259026454  by isaprykin<isaprykin@google.com>:

    Internal change

PiperOrigin-RevId: 259030078


Merged commit includes the following changes: (#7263) · c5a4978d

Jing Li authored Jul 19, 2019

* Merged commit includes the following changes:
258867180  by jingli<jingli@google.com>:

    Add new folders for upcoming reorg in model garden.

--
258893811  by hongkuny<hongkuny@google.com>:

    Adds summaries for metrics, allowing metrics inside keras.model.

--
258893048  by isaprykin<isaprykin@google.com>:

    Remove the `cloning` argument to `compile()`.

    Keras models are distributed by cloning in graph mode and without cloning in eager mode as of the change # 258652546.

--
258881002  by hongkuny<hongkuny@google.com>:

    Fix lint.

--
258874998  by hongkuny<hongkuny@google.com>:

    Internal

--
258872662  by hongkuny<hongkuny@google.com>:

    Fix doc

--

PiperOrigin-RevId: 258867180

* Create __init__.py

* Update __init__.py

* Update __init__.py

* Update __init__.py


18 Jul, 2019 1 commit

Improve Keras graph performance for ResNet56 (#7241) · dd5a91d3

Haoyu Zhang authored Jul 18, 2019

* Config threadpool, cuDNN persistent BN, and grappler layout optimizer properly for ResNet56

* Add tweaked tests for Resnet56

* Avoid triggering the last partial batch overhead by explicitly dropping remainder
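
As a rough illustration of the last point, the sketch below counts batch sizes with and without dropping the remainder. It is plain Python, not the benchmark code: `batch_sizes` is a hypothetical helper, and the dataset size of 50,000 and batch size of 128 are assumptions chosen for the example. In `tf.data`, the corresponding option is the `drop_remainder` argument of `Dataset.batch`.

```python
def batch_sizes(num_examples, batch_size, drop_remainder):
    """Return the size of every batch produced in one pass over the data."""
    full_batches, remainder = divmod(num_examples, batch_size)
    sizes = [batch_size] * full_batches
    if remainder and not drop_remainder:
        # The final, smaller batch is kept only when the remainder is not dropped.
        sizes.append(remainder)
    return sizes


# Assumed sizes, for illustration only.
kept = batch_sizes(50000, 128, drop_remainder=False)
dropped = batch_sizes(50000, 128, drop_remainder=True)
```

With these assumed numbers, keeping the remainder yields one extra batch of 80 examples, while dropping it leaves every batch at the same size of 128.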


11 Jul, 2019 2 commits
- Add stdev to the Dense layer. (#7189) · fa28535d
  Toby Boyd authored Jul 10, 2019
  
- Move Keras Hook to use global step to resolve issues across epochs. (#7186) · f4b02d15
  Toby Boyd authored Jul 10, 2019
```
* Move to global_step.

* Hook to use global_step.

* fix comment start step 1 not step 0.

* remove hack used for testing.

* Add docstring.
```
09 Jul, 2019 1 commit
- Improve performance for Cifar ResNet benchmarks (#7178) · 2ed43e66
  Haoyu Zhang authored Jul 09, 2019
```
* Improve performance for Cifar ResNet benchmarks

* Revert batch size changes to benchmarks
```
03 Jul, 2019 1 commit

Unit tests pass TF 2.0 GPU and CPU locally. (#7101) · 49097655

Toby Boyd authored Jul 03, 2019

* Fix unit tests failures.

* 96% of TF 2.0 tests on GPU are passing.

* Currently all passing GPU and CPU TF 2.0

* Address code comments.

* use tf 2.0 cast.

* Comment about working on TF 2.0 CPU

* Uses contrib turn off for TF 2.0.

* Fix wide_deep and add keras_common_tests.

* use context to get num_gpus.

* Switch to tf.keras.metrics


22 Jun, 2019 1 commit
- Fix unit tests failures. (#7086) · 47a59023
  Toby Boyd authored Jun 22, 2019
  
21 Jun, 2019 2 commits

Add ResNet56 CPU benchmark and accuracy tests. (#7070) · f21337b1
Toby Boyd authored Jun 21, 2019
```
* cpu benchmark and accuracy tests.

* add docstrings to fix lint.
```

NCF XLA and Eager tests with a refactor of resnet flags to make this cleaner. (#7067) · a68f65f8

Toby Boyd authored Jun 21, 2019

* XLA FP32 and first test

* More XLA benchmarks FP32.

* Add eager to NCF and refactor resnet.

* fix v2_0 calls and more flag refactor.

* Remove extra flag args.

* 90 epoch default

* add return

* remove xla not used by estimator.

* Remove duplicate run_eagerly.

* fix flag defaults.

* Remove fp16_implementation flag option.

* Remove stop early on mlperf test.

* remove unneeded args.

* load flags from keras mains.


20 Jun, 2019 4 commits
- Fix test that requires xla flag to be defined (#7072) · adc8f11b
  Haoyu Zhang authored Jun 20, 2019
  
- Fix resnet tests (#7071) · 092def7b
  Haoyu Zhang authored Jun 20, 2019
  
- Improve performance of Keras ResNet models when not using distribution strategy (#7055) · cf3c2407
  Haoyu Zhang authored Jun 20, 2019
```
* Do not set learning phase when skipping eval

* Do not set learning phase in no dist strat case

* Added device placement, tweaked benchmarks

* Added tweaked benchmarks for Cifar

* Fix device scope

* Fix lint

* Add explicit GPU placement flag

* Also run accuracy test with explicit GPU placement

* Added doc string
```
- Fix lint error (Trailing whitespace) (#7059) · 0cc52905
  anj-s authored Jun 19, 2019
```
* .

* .
```
19 Jun, 2019 4 commits

Add XLA to transformer (#7048) · 269581dc

Toby Boyd authored Jun 19, 2019



* set default steps to 300K.

* Log flags to perfzero.

* Add XLA support to transformer

- Moved config logic to keras_utils
- Added enable_xla flag to _performance flags
- Did not refactor enable_xla flag from keras resnet due to
  reliance on calling FLAGs in estimator keras and that is
  a needed refactor for another time.

* fix g3 lint complaint.

* Refactor set config into keras_utils.

* Move flags out of main.

* pipe through enable_xla

* Update official/transformer/v2/misc.py
Co-Authored-By: Reed <reedwm@google.com>


Add flags info when reporting benchmarks (#7056) · 1e527fb5

anj-s authored Jun 19, 2019

* first version of ctl

* fix indent

* remove monkey patching for core

* add dtype arg

* fix dtype arg

* add logging lib

* remove compat.v1.logging

* add datetime import

* fix FLAGS import

* add constant vals

* move to using as tf import

* move to using as tf import

* remove steps per epoch = 1

* test train and test for one step

* test train and test for one step

* test train and test for one step

* test train and test for the entire dataset

* use an iterator for test

* pass tensors instead of an iterator

* add stats dict

* fix list declaration

* fix list declaration

* fix elapsed time calc

* print lr at epoch boundary alone

* Use regular tf import instead of compat

* remove tensorboard chkpts

* add correct logging import

* add correct logging import

* add benchmark configs

* add tests and configs

* add tests and configs

* add keras flags import

* add keras flags import

* fix eval ds creation cond

* return numpy value of train_loss

* return numpy value of loss and acc values

* add option for full eager mode

* fix lint errors

* add ctl flags

* add ctl import

* add the xla flag

* enable v2 behavior in unit tests

* rename dataset var

* add synthetic dataset without monkey patching

* add ctl local constants

* add ctl local constants

* change to using v2 imports

* change to using v2 imports

* change to using v2 imports

* change to using keras synthetic input fn

* remove enable_eager flag from benchmarks

* remove enable_eager flag from benchmarks

* remove enable_eager flag from benchmarks

* add option for no distrat

* add lambda for flags

* remove no_func benchmarks due to OOM error

* remove README

* remove unused comments

* remove unchanged file

* remove unchanged file

* remove unused drop_remainder_arg

* use keras.common lr function

* address PR comments

* remove reference to deleted file

* .

* .

* .

* .

* .

* .

* .

* .

* .

* .

* .

* .

* .

* .

* .

* .

* .

* .

* .

* .

* .

* .

* .

* .

* .

* .

* .

* .

* .

* .

* fix lint errors

* .

* add flags info


Add benchmarks for custom training loops + tf.distribute (#6980) · 65636099

anj-s authored Jun 19, 2019

* first version of ctl

* fix indent

* remove monkey patching for core

* add dtype arg

* fix dtype arg

* add logging lib

* remove compat.v1.logging

* add datetime import

* fix FLAGS import

* add constant vals

* move to using as tf import

* move to using as tf import

* remove steps per epoch = 1

* test train and test for one step

* test train and test for one step

* test train and test for one step

* test train and test for the entire dataset

* use an iterator for test

* pass tensors instead of an iterator

* add stats dict

* fix list declaration

* fix list declaration

* fix elapsed time calc

* print lr at epoch boundary alone

* Use regular tf import instead of compat

* remove tensorboard chkpts

* add correct logging import

* add correct logging import

* add benchmark configs

* add tests and configs

* add tests and configs

* add keras flags import

* add keras flags import

* fix eval ds creation cond

* return numpy value of train_loss

* return numpy value of loss and acc values

* add option for full eager mode

* fix lint errors

* add ctl flags

* add ctl import

* add the xla flag

* enable v2 behavior in unit tests

* rename dataset var

* add synthetic dataset without monkey patching

* add ctl local constants

* add ctl local constants

* change to using v2 imports

* change to using v2 imports

* change to using v2 imports

* change to using keras synthetic input fn

* remove enable_eager flag from benchmarks

* remove enable_eager flag from benchmarks

* remove enable_eager flag from benchmarks

* add option for no distrat

* add lambda for flags

* remove no_func benchmarks due to OOM error

* remove README

* remove unused comments

* remove unchanged file

* remove unchanged file

* remove unused drop_remainder_arg

* use keras.common lr function

* address PR comments

* remove reference to deleted file

* .

* .

* .

* .

* .

* .

* .

* .

* .

* .

* .

* .

* .

* .

* .

* .

* .

* .

* .

* .

* .

* .

* .

* .

* .

* .

* .

* .

* .

* .

* fix lint errors

* .


Use PerfZeroBenchmark and log Flags. (#7052) · d3610769
Toby Boyd authored Jun 19, 2019


14 Jun, 2019 3 commits
- Fix graph.rewrite for TF 2.0 and remove num_parallel_batches (#7019) · d44b7283
  Toby Boyd authored Jun 14, 2019
```
* tf.compat.v1.train.experimental.enable_mixed_precision_graph_rewrite

* Remove num_parallel_batches which is not used.
```
- Add ResNet tests for NHWC and layout optimizer off. (#7018) · 097c8051
  Toby Boyd authored Jun 14, 2019
```
* layout off for some tests and channels last.

* 8 gpu tests channels_last

* more layout off tests.
```
- Resnet56 forced eager benchmark and no_dist_strat eager accuracy test. (#7017) · 7b329985
  Toby Boyd authored Jun 14, 2019
```
* Add 1 gpu force_eager benchmark

* Add accuracy for no dist strat eager

* remove return.
```
13 Jun, 2019 1 commit
- Add run_eagerly and end-to-end test. (#7012) · d8a09064
  Toby Boyd authored Jun 13, 2019
  
10 Jun, 2019 1 commit
- Code cleanup. (#6989) · f7a44074
  rxsang authored Jun 10, 2019
  
06 Jun, 2019 3 commits
- Have each model provide a default loss scale. (#6930) · 42a8af1d
  Reed authored Jun 06, 2019
```
Before, there was a global default loss scale for all models. Currently, only resnet uses loss scaling, but this will be useful once more models support it.
```
- Add pure eager fp16 benchmark (#6977) · ce797486
  Haoyu Zhang authored Jun 06, 2019
  
- Modify tweaked tests for better performance in no cloning mode (#6965) · 152baba5
  Haoyu Zhang authored Jun 05, 2019
```
* Modify tweaked tests for better performance in no cloning mode

* Tweak trivial models
```
05 Jun, 2019 1 commit
- Add more optional_next tests (#6955) · 346b570f
  rxsang authored Jun 04, 2019
  
04 Jun, 2019 1 commit

Multi-worker ResNet Estimator benchmarks for PerfZero. (#6950) · 302fa739

Ayush Dubey authored Jun 04, 2019

* Add multi-worker benchmarks to official resnet estimator_benchmark.py.

* fix super constructor calls

* set datasets_num_private_threads to 32 in multi worker tweaked benchmarks


03 Jun, 2019 2 commits

Do not use XLA in warmup tests (#6951) · dcdc45bd

Haoyu Zhang authored Jun 03, 2019

Because we run warmup tests in all real data benchmarks, XLA bugs will cause non-XLA tests to fail as well.


Resnet mlperf like (#6942) · 69e2e3f6

Toby Boyd authored Jun 03, 2019

* Add mlperf like test.

* Final comments.

* docstring wording tweak.

* non-tweaked version


31 May, 2019 1 commit
- Fix internal lint errors (#6937) · 7546a9e3
  Haoyu Zhang authored May 31, 2019
  