- 06 Jun, 2021 2 commits
-
-
Hongkun Yu authored
PiperOrigin-RevId: 377803367
-
Hongkun Yu authored
PiperOrigin-RevId: 377801393
-
- 09 Apr, 2021 1 commit
-
-
Reed Wanderman-Milne authored
All models which support loss scaling support dynamic loss scaling, so the argument has no purpose. It used to be that some models scaled the loss manually instead of using a LossScaleOptimizer, and so did not support dynamic loss scaling. PiperOrigin-RevId: 367719521
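For context, a minimal sketch of the default behavior this commit relies on: wrapping an optimizer in a Keras LossScaleOptimizer gives dynamic loss scaling out of the box (assumes TF 2.4+; the optimizer and model below are illustrative, not the repo's code).

```python
import tensorflow as tf

# Dynamic loss scaling is the default; no static loss-scale argument is needed.
opt = tf.keras.optimizers.SGD(learning_rate=0.1)
opt = tf.keras.mixed_precision.LossScaleOptimizer(opt)

model = tf.keras.Sequential([tf.keras.layers.Dense(10)])
model.compile(optimizer=opt, loss='mse')
```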
-
- 06 Apr, 2021 1 commit
-
-
Reed Wanderman-Milne authored
The function `tf.train.experimental.enable_mixed_precision_graph_rewrite` will be removed from the TF2 namespace soon, at which point it will only be accessible under tf.compat.v1. PiperOrigin-RevId: 367046393
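A hedged sketch of the migration this implies (APIs as of TF 2.4+; not the repo's exact change): the TF2-native replacement is a global Keras mixed-precision policy, while the graph rewrite stays reachable only through tf.compat.v1.

```python
import tensorflow as tf

# TF2-native replacement for the graph rewrite: a global mixed-precision policy.
tf.keras.mixed_precision.set_global_policy('mixed_float16')

# The graph-rewrite entry point remains reachable only under tf.compat.v1
# (and cannot be combined with the Keras policy above), e.g.:
#   opt = tf.compat.v1.mixed_precision.enable_mixed_precision_graph_rewrite(opt)
```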
-
- 28 Feb, 2021 2 commits
-
-
Hongkun Yu authored
PiperOrigin-RevId: 359994674
-
Hongkun Yu authored
PiperOrigin-RevId: 359990341
-
- 24 Jan, 2021 1 commit
-
-
Hongkun Yu authored
PiperOrigin-RevId: 353533479
-
- 12 Aug, 2020 2 commits
-
-
Hongkun Yu authored
PiperOrigin-RevId: 326286926
-
Hongkun Yu authored
PiperOrigin-RevId: 326286926
-
- 29 Apr, 2020 1 commit
-
-
Hongkun Yu authored
PiperOrigin-RevId: 309079916
-
- 25 Apr, 2020 1 commit
-
-
Sergey Mironov authored
-
- 22 Apr, 2020 1 commit
-
-
Hongkun Yu authored
The logger was probably replaced by perfzero(?). PiperOrigin-RevId: 307756692
-
- 17 Mar, 2020 1 commit
-
-
ayushmankumar7 authored
-
- 05 Mar, 2020 1 commit
-
-
Hongkun Yu authored
PiperOrigin-RevId: 299160422
-
- 02 Mar, 2020 1 commit
-
-
Will Cromar authored
PiperOrigin-RevId: 298466825
-
- 25 Feb, 2020 1 commit
-
-
Hongkun Yu authored
PiperOrigin-RevId: 297002741
-
- 27 Nov, 2019 1 commit
-
-
Hongkun Yu authored
PiperOrigin-RevId: 282669615
-
- 28 Oct, 2019 1 commit
-
-
Zongwei Zhou authored
PiperOrigin-RevId: 277082247
-
- 21 Oct, 2019 1 commit
-
-
minoring authored
arparse -> argparse
-
- 16 Oct, 2019 1 commit
-
-
Reed Wanderman-Milne authored
To test, I did 50 fp32 runs and 50 fp16 runs. I used the following command:

python ncf_keras_main.py --dataset=ml-20m --num_gpus=1 --train_epochs=10 --clean --batch_size=99000 --learning_rate=0.00382059 --beta1=0.783529 --beta2=0.909003 --epsilon=1.45439e-7 --layers=256,256,128,64 --num_factors=64 --hr_threshold=0.635 --ml_perf --nouse_synthetic_data --data_dir ~/ncf_data_dir_python3 --model_dir ~/tmp_model_dir --keras_use_ctl

For the fp16 runs, I added --dtype=fp16. The average hit-rate for both fp16 and fp32 was 0.6365. I also did 50 runs with the mixed precision graph rewrite, and the average hit-rate was 0.6363. The difference is likely due to noise.

PiperOrigin-RevId: 275059871
-
- 07 Oct, 2019 1 commit
-
-
A. Unique TensorFlower authored
PiperOrigin-RevId: 273371605
-
- 09 Sep, 2019 1 commit
-
-
Reed Wanderman-Milne authored
--stop_threshold, --num_gpu, --hooks, --export_dir, and --distribution_strategy have been unexposed from models which do not use them PiperOrigin-RevId: 268032080
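A minimal sketch of the "unexposing" pattern these commits apply (the helper name and defaults below are illustrative, not the repo's exact code): a shared flag-definition helper only defines a flag when the calling model actually consumes it.

```python
from absl import flags

def define_base_flags(stop_threshold=True, export_dir=True):
  """Defines shared flags, skipping any the caller opts out of."""
  if stop_threshold:
    flags.DEFINE_float('stop_threshold', None,
                       'Stop training once this metric value is reached.')
  if export_dir:
    flags.DEFINE_string('export_dir', None,
                        'Directory for exporting the SavedModel.')

# A model that uses neither flag simply opts out, so the flags are never exposed:
define_base_flags(stop_threshold=False, export_dir=False)
```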
-
- 04 Sep, 2019 1 commit
-
-
Reed Wanderman-Milne authored
--clean, --train_epochs, and --epochs_between_evals have been unexposed from models which do not use them PiperOrigin-RevId: 267065651
-
- 30 Aug, 2019 1 commit
-
-
Reed Wanderman-Milne authored
PiperOrigin-RevId: 266376708
-
- 26 Aug, 2019 1 commit
-
-
Reed Wanderman-Milne authored
--synthetic_data, --dtype, --all_reduce_alg, and --num_packs have been unexposed from models which do not use them PiperOrigin-RevId: 265483564
-
- 23 Aug, 2019 1 commit
-
-
Reed Wanderman-Milne authored
--num_parallel_calls, --inter_op_parallelism_threads, and --intra_op_parallelism_threads have been unexposed from models which do not use them PiperOrigin-RevId: 264965788
-
- 20 Aug, 2019 2 commits
-
-
Vinh Nguyen authored
-
Vinh Nguyen authored
-
- 19 Aug, 2019 1 commit
-
-
Reed Wanderman-Milne authored
Only the V1 resnet model uses --max_train_steps. This unexposes the flag in the keras_application_models, mnist, keras resnet, and CTL resnet models. Before this change, those models allowed the flag to be specified but ignored it. I also removed the "max_train" argument from the run_synthetic function, since it only had meaning for the V1 resnet model. Instead, the V1 resnet model now directly passes --max_train_steps=1 to run_synthetic.

PiperOrigin-RevId: 264269836
-
- 16 Aug, 2019 1 commit
-
-
Ayush Dubey authored
Also add `worker_hosts` and `task_index` flags. These flags enable running the model over multiple hosts by passing the cluster information via command line. Setting `TF_CONFIG` will continue to work. PiperOrigin-RevId: 263825245
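For reference, a sketch of the TF_CONFIG equivalent of the new command-line flags (host names, ports, and the task index below are illustrative):

```python
import json
import os

# Equivalent of something like --worker_hosts=host1:12345,host2:12345 --task_index=0
os.environ['TF_CONFIG'] = json.dumps({
    'cluster': {'worker': ['host1:12345', 'host2:12345']},
    'task': {'type': 'worker', 'index': 0},
})
```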
-
- 06 Aug, 2019 1 commit
-
-
Toby Boyd authored
* force_v2_in_keras_compile FLAG defaults to None; added a separate temp path.
* Switch to force testing the v1 path, not the v2 path.
* Rename function force_v1_path.
-
- 23 Jul, 2019 1 commit
-
-
Toby Boyd authored
* Add force_run_distributed tests.
* Added enable_eager.
* r/force_run_distributed/force_v2_in_keras_compile
* Adding force_v2 tests and FLAGs.
* Rename method to avoid conflict.
* Add cpu force_v2 tests.
* Fix lint, wrap line.
* Change to force_v2_in_keras_compile.
* Update method name.
* Lower mlperf target to 0.736.
-
- 21 Jun, 2019 2 commits
-
-
Neil authored
-
Toby Boyd authored
* XLA FP32 and first test.
* More XLA benchmarks FP32.
* Add eager to NCF and refactor resnet.
* Fix v2_0 calls and more flag refactor.
* Remove extra flag args.
* 90 epoch default.
* Add return.
* Remove xla not used by estimator.
* Remove duplicate run_eagerly.
* Fix flag defaults.
* Remove fp16_implementation flag option.
* Remove stop early on mlperf test.
* Remove unneeded args.
* Load flags from keras mains.
-
- 19 Jun, 2019 1 commit
-
-
Toby Boyd authored
* Set default steps to 300K.
* Log flags to perfzero.
* Add XLA support to transformer (see the sketch below):
  - Moved config logic to keras_utils.
  - Added enable_xla flag to _performance flags.
  - Did not refactor the enable_xla flag from keras resnet, due to reliance on calling FLAGs in estimator keras; that is a needed refactor for another time.
* Fix g3 lint complaint.
* Refactor set config into keras_utils.
* Move flags out of main.
* Pipe through enable_xla.
* Update official/transformer/v2/misc.py

Co-Authored-By: Reed <reedwm@google.com>
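A minimal sketch of what an enable_xla-style flag typically toggles (TF 2.x API; the wrapper function is illustrative, not the repo's exact plumbing):

```python
import tensorflow as tf

def set_xla(enable_xla):
  """Turns on XLA auto-clustering for TensorFlow-compiled graphs."""
  if enable_xla:
    tf.config.optimizer.set_jit(True)
```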
-
- 06 Jun, 2019 1 commit
-
-
Reed authored
Before, there was a global default loss scale for all models. Currently, only resnet uses loss scaling, but this will be useful once more models support it.
-
- 18 May, 2019 1 commit
-
-
Reed authored
This will allow one to easily reproduce a benchmark by running with the flags.
-
- 15 May, 2019 1 commit
-
-
Rachel Lim authored
* Added a 'tfdata_exp' version of all benchmarks which sets FLAGS.tf_data_experimental_slack = True (see the sketch below). Renamed `data_prefetch_with_slack` to `data_delay_prefetch` (haoyu's change) to make the names more distinct.
* Add flag to resnet input pipeline and surface it through keras_imagenet_main.py.
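A sketch of what FLAGS.tf_data_experimental_slack switches on in the input pipeline (the dataset below is illustrative):

```python
import tensorflow as tf

dataset = tf.data.Dataset.range(1000).batch(32).prefetch(tf.data.experimental.AUTOTUNE)

# Introduce slack in the final prefetch to reduce CPU contention at step start.
options = tf.data.Options()
options.experimental_slack = True
dataset = dataset.with_options(options)
```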
-
- 11 May, 2019 1 commit
-
-
Toby Boyd authored
* Add FP16 and benchmarks.
* Add missing run and report.
* Add loss_scale as an option not included with dtype.
* Move loss_scale validation under the dtype conditional (see the sketch below).
* Add loss_scale to flags tested.
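A minimal sketch (not the repo's exact code) of validating a loss_scale flag only when dtype is fp16, as the validation bullet above describes; flag names mirror the bullets and the defaults are illustrative:

```python
from absl import flags

flags.DEFINE_string('dtype', 'fp32', 'Compute dtype: fp16 or fp32.')
flags.DEFINE_integer('loss_scale', None, 'Fixed loss scale; only valid with fp16.')

@flags.multi_flags_validator(
    ['dtype', 'loss_scale'],
    message='--loss_scale is only meaningful with --dtype=fp16')
def _check_loss_scale(flag_values):
  # Allow loss_scale only when it is unset or the compute dtype is fp16.
  return flag_values['loss_scale'] is None or flag_values['dtype'] == 'fp16'
```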
-
- 01 May, 2019 1 commit
-
-
Reed authored
This option allows the new tf.train.experimental.enable_mixed_precision_graph_rewrite() function to be used for fp16, instead of manual casts.
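A hedged sketch of the graph-rewrite path this option enables (API as of TF 1.14/2.0; the optimizer is illustrative):

```python
import tensorflow as tf

opt = tf.keras.optimizers.SGD(learning_rate=0.1)
# Rewrites the graph to run eligible ops in fp16 and adds dynamic loss scaling,
# so no manual casts are needed in the model code.
opt = tf.train.experimental.enable_mixed_precision_graph_rewrite(opt)
```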
-