Commits · 7033c8a26c150eefcedf43338d5d39a32ab0e3a1 · ModelZoo / ResNet50_tensorflow

24 May, 2019 2 commits

Add early stopping logic to ncf keras when desired threshold is met. Also... · 7033c8a2

Priya Gupta authored May 23, 2019

Add early stopping logic to ncf keras when desired threshold is met. Also change the default batch size to match the tuned hyperparams

7033c8a2
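
The early-stopping behaviour described in 7033c8a2 can be sketched roughly as below. This is a minimal illustration in plain Python, not the code from the commit: `train_with_early_stopping` and `run_epoch` are hypothetical names, and the metric values are made up for the demo.

```python
from typing import Callable, List


def train_with_early_stopping(
    run_epoch: Callable[[int], float],
    max_epochs: int,
    threshold: float,
) -> List[float]:
    """Run epochs until the validation metric reaches `threshold`.

    `run_epoch` is a hypothetical callback that trains one epoch and
    returns the validation metric (for NCF, something like a hit rate).
    """
    history = []
    for epoch in range(max_epochs):
        metric = run_epoch(epoch)
        history.append(metric)
        if metric >= threshold:
            # Desired threshold met: skip the remaining epochs.
            break
    return history


# Toy usage with made-up metric values; these are not measured results.
fake_metrics = [0.40, 0.55, 0.62, 0.64, 0.65]
history = train_with_early_stopping(
    lambda epoch: fake_metrics[epoch], max_epochs=5, threshold=0.63
)
print(history)
```

In a Keras training loop the same check would typically live in a callback that runs at the end of each epoch; that wiring is left out here.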

Merged commit that fixes transformer's predict and eval. (#6874) · b9cab01b

Tian Lin authored May 24, 2019

* Merged commit includes the following changes:
249776315  by tianlin<tianlin@google.com>:

    Internal change

249763206  by tianlin<tianlin@google.com>:

    For TF 2.0 (related to Beam Search), expand cond dims in tf.where(cond, x, y) to make all parameters broadcastable.

--
249392724  by hongkuny<hongkuny@google.com>:

    Internal change

PiperOrigin-RevId: 249776315

* Merged commit includes the following changes:
249823043  by tianlin<tianlin@google.com>:

    Bring back v2 test for predict and eval.

--

PiperOrigin-RevId: 249823043

b9cab01b
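
The `tf.where(cond, x, y)` note in b9cab01b refers to the TF 2.0 behaviour where all three arguments must be broadcastable against each other, so a `[batch]` condition needs an extra axis before it can select from `[batch, dim]` tensors. The sketch below imitates that rule with nested Python lists rather than TensorFlow; `where_broadcast` and `expand_last_dim` are hypothetical helpers (the latter plays the role of `tf.expand_dims(cond, -1)`).

```python
def expand_last_dim(cond):
    """Turn a [batch] vector into a [batch, 1] column."""
    return [[c] for c in cond]


def where_broadcast(cond, x, y):
    """Element-wise select on 2-D nested lists.

    A `cond` row of length 1 is stretched across the whole row, which
    mimics broadcasting a [batch, 1] condition against [batch, dim] data.
    """
    out = []
    for cond_row, x_row, y_row in zip(cond, x, y):
        if len(cond_row) == 1:
            cond_row = cond_row * len(x_row)
        out.append([xv if c else yv for c, xv, yv in zip(cond_row, x_row, y_row)])
    return out


finished = [True, False]          # shape [batch]
x = [[1, 2, 3], [4, 5, 6]]        # shape [batch, dim]
y = [[0, 0, 0], [9, 9, 9]]        # shape [batch, dim]

selected = where_broadcast(expand_last_dim(finished), x, y)
print(selected)
```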

23 May, 2019 2 commits

NCF Keras: Add validation every epoch · abe9e96a

guptapriya authored May 23, 2019

Adding validation every epoch allows us to view the progress during training instead of having to wait until the last eval. Mostly useful for manual runs.

abe9e96a

Change batch size and epochs for NCF benchmarks · e8f97a1d

guptapriya authored May 23, 2019

Current batch size 160000 does not converge to the desired HR. So we decrease to 99k which is known to converge. Tested locally and got to 63.5 at epoch 7. Also decreasing number of epochs as I don't see any improvement after epoch 7-8.

e8f97a1d

15 May, 2019 1 commit

Set the --clone_model_in_keras_dist_strat to None. (#6781) · 2d4cfad0

Igor authored May 15, 2019

* Set the --clone_model_in_keras_dist_strat to None.  Remove the separate no_cloning benchmarks and add a couple of cloning ones.  Fixes the learning rate schedule to cache its ops per graph.

2d4cfad0

08 May, 2019 1 commit
- r/tf.random_uniform/tf.random.uniform (#6735) · 9c5253f1
  Toby Boyd authored May 08, 2019
  
  9c5253f1
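
The `r/old/new` shorthand in 9c5253f1 describes a plain rename of `tf.random_uniform` to `tf.random.uniform`. A rename like this could be applied to source text with a regular expression, as in the sketch below; `rename_random_uniform` is a hypothetical helper, and nothing here is taken from the commit itself.

```python
import re

# Match the old name only as a whole identifier, so longer names such as
# tf.random_uniform_initializer are left alone.
OLD_NAME = re.compile(r"\btf\.random_uniform\b")


def rename_random_uniform(source: str) -> str:
    """Replace tf.random_uniform with tf.random.uniform in source text."""
    return OLD_NAME.sub("tf.random.uniform", source)


sample = (
    "x = tf.random_uniform([2, 3])\n"
    "init = tf.random_uniform_initializer()\n"
)
updated = rename_random_uniform(sample)
print(updated)
```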
29 Apr, 2019 3 commits

Replace per_device with per_replica and PerDevice with PerReplica, because the... · b00783d7

Igor authored Apr 29, 2019

Replace per_device with per_replica and PerDevice with PerReplica, because the PerDevice concept was renamed and doesn't exist anymore. (#6693)

* Replace per_device with per_replica and PerDevice with PerReplica, because the PerDevice concept was renamed and doesn't exist anymore.

b00783d7

Add accuracy check. (#6694) · 294660bd

Toby Boyd authored Apr 29, 2019

* Add accuracy check.

* Avoid double flag init, move data_dir to real data.

* Comment on lower accuracy target.

294660bd

Add benchmarks with the --cloning flag to Resnet and NCF. (#6675) · af47736d

Igor authored Apr 29, 2019

* Add benchmarks with the --cloning flag to Resnet and NCF.

* Renamed cloning to clone_model_in_keras_dist_strat. Dropped a few tests that aren't essential.

* Fixed up the formatting after renaming the flag to a much longer name. Thanks, lint.

* Fixed the lint error in ncf_common.py

af47736d

22 Apr, 2019 2 commits
- Ncf metric tweaks (#6633) · 042c9aaa
  Toby Boyd authored Apr 22, 2019
```
* Use tf.image.resize_with_crop_or_pad

* exp_per_second and hr_at_10
```
  042c9aaa
- Add usernames to TODOs (#6619) · 5c37e69c
  Shining Sun authored Apr 22, 2019
  
  5c37e69c
20 Apr, 2019 2 commits

Add 2-GPU benchmark for NCF (#6589) · d11aa330

Shining Sun authored Apr 19, 2019

d11aa330

Remove contrib imports, or move them inline (#6591) · 8ff9eb54

Shining Sun authored Apr 19, 2019

* Remove contrib imports, or move them inline

* Use exposed API for FixedLenFeature

* Replace tf.logging with absl logging

* Change GFile to v2 APIs

* replace tf.logging with absl logging in movielens

* Fixing an import bug

* Change gfile to v2 APIs in code

* Swap to keras optimizer v2

* Bug fix for optimizer

* Change tf.log to tf.keras.backend.log

* Change the loss function to keras loss

* convert another loss to keras loss

* Resolve comments and fix lint

* Add a doc string

* Fix existing tests and add new tests for DS

* Added tests for multi-replica

* Fix lint

* resolve comments

* make estimator run in tf2.0

* use compat v1 loss

* fix lint issue

8ff9eb54

18 Apr, 2019 1 commit
- Fix the batch_size of the ncf benchmark (#6597) · f519c015
  Shining Sun authored Apr 18, 2019
  
  f519c015
08 Apr, 2019 1 commit

Add DS support for NCF keras (#6447) · 1255d5b9

Shining Sun authored Apr 08, 2019

* add ds support for ncf

* remove comments for in_top_k

* avoid expanding the input layers

* resolve comments and fix lint

* Added some comments in code and fix lint

* fix lint

* add some documentation

* add tensorflow imports

1255d5b9

02 Apr, 2019 1 commit
- added missing flags to the real data benchmark (#6483) · 74d924e9
  Shining Sun authored Apr 02, 2019
  
  74d924e9
28 Mar, 2019 1 commit

Added benchmark test and convergence test for the NCF model (#6318) · 4c11b84b

Shining Sun authored Mar 28, 2019

* initial commit

* bug fix

* Move build_stats from common to keras main, because it is only applicable in keras

* remove trailing blank line

* add test for synth data

* add kwargs to init

* add kwargs to function invocation

* correctly pass kwargs

* debug

* debug

* debug

* fix super init

* bug fix

* fix local_flags

* fix import

* bug fix

* fix log_steps flag

* bug fix

* bug fix: add missing return value

* resolve double-defined flags

* lint fix

* move log_steps flag to benchmark flag

* fix lint

* lint fix

* lint fix

* try flag core default values

* bug fix

* bug fix

* bug fix

* debug

* debug

* remove debug prints

* rename benchmark methods

* flag bug fix for synth benchmark

4c11b84b

27 Mar, 2019 1 commit

Change function signature (#6459) · 0b2b8997

cclauss authored Mar 27, 2019

* from NCF_input import NCFDataset for line 181

The type __NCFDataset__ is used in the type declaration on line 81 but it is never imported.

[flake8](http://flake8.pycqa.org) testing of https://github.com/tensorflow/models on Python 3.7.1

$ __flake8 . --count --select=E9,F63,F72,F82 --show-source --statistics__
```
./official/recommendation/data_preprocessing.py:180:3: F821 undefined name 'NCFDataset'
  # type: (str, str, dict, typing.Optional[str], bool, typing.Optional[str]) -> (NCFDataset, typing.Callable)
  ^
1    F821 undefined name 'NCFDataset'
1
```
__E901,E999,F821,F822,F823__ are the "_showstopper_" [flake8](http://flake8.pycqa.org) issues that can halt the runtime with a SyntaxError, NameError, etc. These 5 are different from most other flake8 issues, which are merely "style violations": useful for readability, but they do not affect runtime safety.
* F821: undefined name `name`
* F822: undefined name `name` in `__all__`
* F823: local variable name referenced before assignment
* E901: SyntaxError or IndentationError
* E999: SyntaxError -- failed to compile a file into an Abstract Syntax Tree

* int, int, data_pipeline.BaseDataConstructor

0b2b8997

26 Mar, 2019 1 commit

Python typing: Use 'str', not 'string' (#6422) · e2ef6108

cclauss authored Mar 26, 2019

https://mypy.readthedocs.io/en/latest/cheat_sheet.html

[flake8](http://flake8.pycqa.org) testing of https://github.com/tensorflow/models on Python 3.7.1

$ __flake8 . --count --select=E9,F63,F72,F82 --show-source --statistics__
```
./official/recommendation/data_pipeline.py:346:41: F821 undefined name 'string'
               epoch_dir=None           # type: string
                                        ^
```

e2ef6108
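
The F821 report in e2ef6108 comes down to `string` not being a defined name in Python, whereas `str` is a builtin type. A small sketch of a comment-style type hint that uses `str` follows; `describe_epoch_dir` is a hypothetical function, not one from `data_pipeline.py`.

```python
import builtins
from typing import Optional


def describe_epoch_dir(epoch_dir=None):
    # type: (Optional[str]) -> str
    """Hypothetical helper: report where epoch data would be stored."""
    return "default location" if epoch_dir is None else "custom: " + epoch_dir


# `str` is a builtin name; `string` is not, so a checker that resolves
# names inside type comments flags `string` as undefined.
print(hasattr(builtins, "str"), hasattr(builtins, "string"))
print(describe_epoch_dir("/tmp/epochs"))
```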

18 Mar, 2019 1 commit
- Add support for TPUEstimator to data processing pipeline and add the … (#6330) · cf304238
  Bruce Fontaine authored Mar 18, 2019
```
* Add support for TPUEstimator to data processing pipeline and add the ability to store epochs in user specified location.
```
  cf304238
13 Mar, 2019 1 commit

Fix ncf test for keras (#6355) · dadc4a62

Shining Sun authored Mar 13, 2019

* Fix ncf test for keras

* add a todo for batch_size and eval_batch_size for ncf keras

* lint fix

* fix typos

* Lint fix

* fix lint

* resolve pr comment

* resolve pr comment

dadc4a62

02 Mar, 2019 1 commit
- fix resnet breakage and add keras end-to-end tests (#6295) · 8367cf6d
  Taylor Robie authored Mar 02, 2019
```
* fix resnet breakage and add keras end-to-end tests

* delint

* address PR comments
```
  8367cf6d
01 Mar, 2019 1 commit

Keras-fy NCF Model (#6092) · 048e5bff

Shining Sun authored Mar 01, 2019

* tmp commit

* tmp commit

* first attempt (without eval)

* Bug fixes

* bug fixes

* training done

* Loss NAN, no eval

* Loss weight problem solved

* resolve the NAN loss problem

* Problem solved. Clean up needed

* Added a todo

* Remove debug prints

* Extract get_optimizer to ncf_common

* Move metrics computation back to neumf; use DS.scope api

* Extract DS.scope code to utils

* lint fixes

* Move obtaining DS above producer.start to avoid race condition

* move pt 1

* move pt 2

* Update the run script

* Wrap keras_model related code into functions

* Update the doc for softmax_logitfy and change the method name

* Resolve PR comments

* working version with: eager, DS, batch and no masks

* Remove git conflict indicator

* move reshape to neumf_model

* working version, not converge

* converged

* fix a test

* more lint fix

* more lint fix

* more lint fixes

* more lint fix

* Removed unused imports

* fix test

* dummy commit for kicking off checks

* fix lint issue

* dummy input to kick off checks

* dummy input to kick off checks

* add collective to dist strat

* addressed review comments

* add a doc string

048e5bff

30 Jan, 2019 1 commit
- Explicitly allow for script execution from any directory. Make env vars... · b3158fb0
  Tayo Oguntebi authored Jan 29, 2019
```
Explicitly allow for script execution from any directory.  Make env vars visible in python script. (#6105)
```
  b3158fb0
08 Jan, 2019 5 commits
- missed a not · 7021ac1c
  Taylor Robie authored Jan 08, 2019
  
  7021ac1c
- don't use forkpool to shuffle with TPUs · f1efaf83
  Taylor Robie authored Jan 08, 2019
  
  f1efaf83
- update call to TPUStrategy · c8be4828
  Taylor Robie authored Jan 08, 2019
  
  c8be4828
- restore data test · 322def29
  Taylor Robie authored Jan 08, 2019
  
  322def29
- completely disable data_test · 3bbb62e7
  Taylor Robie authored Jan 07, 2019
  
  3bbb62e7
07 Jan, 2019 11 commits
- disable more tests · 16e0c773
  Taylor Robie authored Jan 07, 2019
  
  16e0c773
- disable tests to debug test failure · a8d10447
  Taylor Robie authored Jan 07, 2019
  
  a8d10447
- fix test now that cache construction no longer uses match_mlperf · ea36125a
  Taylor Robie authored Jan 07, 2019
  
  ea36125a
- remove match_mlperf from expected cache keys · fefe47ee
  Taylor Robie authored Jan 07, 2019
  
  fefe47ee
- Revert "remove mock to debug kokoro failures" · e0f26727
  Taylor Robie authored Jan 07, 2019
```
This reverts commit 63f5827d.
```
  e0f26727
- remove mock to debug kokoro failures · 63f5827d
  Taylor Robie authored Jan 07, 2019
  
  63f5827d
- fix lint errors · 0fbc71fc
  Taylor Robie authored Jan 07, 2019
  
  0fbc71fc
- address more PR comments · 6726c5e0
  Taylor Robie authored Jan 07, 2019
  
  6726c5e0
- address PR comments · 1bb074b0
  Taylor Robie authored Jan 07, 2019
  
  1bb074b0
- add unbuffer to run.sh as tee is causing issues · 444f5993
  Taylor Robie authored Dec 27, 2018
  
  444f5993
- skip bisection when it is not needed · 4cdea1cc
  Taylor Robie authored Dec 27, 2018
  
  4cdea1cc