- 24 May, 2019 2 commits
-
-
Toby Boyd authored
* Moved common keras code to utils. * Initial 1 gpu benchmark - Aligned flags with resnet example - removed code/features that are not super useful - eval as part of train if bleu source/ref provided - add exp_per_second hook * Rename benchmark classes, pass batch-size and log_steps. * fix docstring * Predict done with checkpoints inline - perfzero baseclass * steps not epochs with smoother training loop. * do not initialize history outside loop. * 5000 between eval not 500 * estimator to keras. * remove epochs var. * use range not xrange. * 200K steps for 1 gpu * fix global step
-
Tian Lin authored
* Merged commit includes the following changes: 249776315 by tianlin<tianlin@google.com>: Internal change 249763206 by tianlin<tianlin@google.com>: For TF 2.0 (related to Beam Search), expand cond dims in tf.where(cond, x, y) to make all parameters broadcastable. -- 249392724 by hongkuny<hongkuny@google.com>: Internal change PiperOrigin-RevId: 249776315 * Merged commit includes the following changes: 249823043 by tianlin<tianlin@google.com>: Bring back v2 test for predict and eval. -- PiperOrigin-RevId: 249823043
-
- 22 May, 2019 3 commits
-
-
Toby Boyd authored
-
Toby Boyd authored
* Add big tests. * fix super * Add fp16, increase 8xGPU batch-sizes * Adding the rest of the fp16 tests. * Big accuracy test batch_perf_gpu * fix docstrings * add _run_and_report * Edited docstrings
-
Tian Lin authored
* Merged commit includes the following changes: 249218656 by tianlin<tianlin@google.com>: Deal with imports, fix a typo and make unit tests fast. -- 249198645 by tianlin<tianlin@google.com>: Trivial: Remove one empty line before "import tensorflow" -- 249195490 by tianlin<tianlin@google.com>: Initialize Transformer TF V2 Model with Keras subclassing implementation. (Compatible with TF V1) -- 249195008 by tianlin<tianlin@google.com>: Internal change 249173564 by hongkuny<hongkuny@google.com>: Internal change 249079258 by hongkuny<hongkuny@google.com>: Internal change 247691534 by haoyuzhang<haoyuzhang@google.com>: Internal change 247533725 by haoyuzhang<haoyuzhang@google.com>: Internal change 247509295 by haoyuzhang<haoyuzhang@google.com>: Internal change 247311355 by wangtz<wangtz@google.com>: Internal change 247303127 by wangtz<wangtz@google.com>: ...
-
- 11 May, 2019 1 commit
-
-
Toby Boyd authored
* Add FP16 and benchmarks. * add missing run and report. * Add loss_scale as option not included with dtype. * move loss_scale validation under dtype conditional. * add loss_scale to flags tested.
-
- 09 May, 2019 1 commit
-
-
Toby Boyd authored
* Add first benchmark and return stats. * Remove print statements update training steps. * Revert print T: in print statement. * Remove print(stats) * add 2 gpu accuracy test for base. * Fixed total_batch_size when using gpu + gFile deprecations. * 8 GPU test name fix * Add 4 and 8 GPU tests. * typo fixes. * Clean up test names and methods. * bleu uncased. docstring format fix.
-
- 07 May, 2019 1 commit
-
-
Toby Boyd authored
-
- 29 Apr, 2019 2 commits
-
-
Igor authored
Replace per_device with per_replica and PerDevice with PerReplica, because the PerDevice concept was renamed and doesn't exist anymore. (#6693) * Replace per_device with per_replica and PerDevice with PerReplica, because the PerDevice concept was renamed and doesn't exist anymore.
-
Songyi Blair Han authored
-
- 12 Apr, 2019 1 commit
-
-
Yash Katariya authored
* Update README.md * Update README.md * Update README.md
-
- 13 Feb, 2019 1 commit
-
-
Yuefeng Zhou authored
* Add a flag to specify distribution strategies. * Fix a small error. * Address comments. * Address comments. * Fix typos.
-
- 02 Feb, 2019 1 commit
-
-
Paige Bailey authored
-
- 15 Jan, 2019 1 commit
-
-
wangtz authored
It currently fails with TypeError: not all arguments converted during string formatting
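For reference, Python raises this TypeError when a `%`-format string has fewer placeholders than the arguments supplied. A minimal sketch with made-up values (the call site this commit fixed is not shown in this log, so `step`, `loss`, and the message text are assumptions):

```python
# Hypothetical values; the real call site is not shown in this log.
step, loss = 100, 0.25

try:
    # One placeholder, two arguments: the second argument is never consumed.
    message = "step %d" % (step, loss)
except TypeError as err:
    message = str(err)

# Supplying a placeholder for every argument formats cleanly.
fixed = "step %d, loss %.2f" % (step, loss)
```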
-
- 20 Dec, 2018 1 commit
-
-
Mark Daoust authored
For tf2 this will only be available in `compat.v1`.
-
- 17 Dec, 2018 1 commit
-
-
bananabowl authored
Explicitly pass values kwarg to tf.name_scope as it is currently being treated as the default_name kwarg instead. This causes an exception to be thrown in eager mode.
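The TF 1.x signature is `tf.name_scope(name, default_name=None, values=None)`, so a list passed as the second positional argument is bound to `default_name`. A minimal sketch of the mix-up, using a hypothetical stand-in function `name_scope_args` with the same parameter order (it is not TensorFlow itself, and the argument values are placeholders):

```python
def name_scope_args(name, default_name=None, values=None):
    """Stand-in with the TF 1.x tf.name_scope parameter order (hypothetical helper)."""
    return {"name": name, "default_name": default_name, "values": values}

tensors = ["x", "y"]  # placeholders for the tensors an op receives

# Positional call: the list is bound to default_name, not values.
positional = name_scope_args("my_op", tensors)

# Keyword call: the list reaches values as intended.
keyword = name_scope_args("my_op", values=tensors)
```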
-
- 04 Oct, 2018 1 commit
-
-
Taylor Robie authored
* set strip_default_attrs=True for SavedModel exports * specify dtype in resnet export * another dtype fix * fix another dtype issue, and set --image_bytes_as_serving_input to default to False
-
- 30 Aug, 2018 2 commits
-
-
Aman Gupta authored
Bypass the export-model step when training on TPUs, because exporting requires inference to be supported on TPUs. Remove this check once inference is supported. (#5209)
-
Aman Gupta authored
Bypass the export-model step when training on TPUs, because exporting requires inference to be supported on TPUs. Remove this check once inference is supported.
-
- 16 Aug, 2018 1 commit
-
-
Jules Gagnon-Marchand authored
* Deterministic dataset order fix: In order for the order of the files to be deterministic, in `tf.data.Dataset.list_files(..., shuffle)`, shuffle needs to be True, otherwise different iterator inits will yield different file orders. * removed unnecessary shuffle of filenames * Removed the `_FILE_SHUFFLE_BUFFER` definition
-
- 14 Aug, 2018 1 commit
-
-
alope107 authored
* Fix Transformer TPU crash in Python 2.X. - Tensorflow raises an error when tf_inspect.getfullargspec is called on a functools.partial in Python 2.X. This issue would be hit during the eval stage of the Transformer TPU model. This change replaces the call to functools.partial with a lambda to work around the issue. * Remove unused import from transformer_main. * Fix lint error.
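The workaround described above can be sketched in plain Python. The function `evaluate` and the `params` dict are hypothetical stand-ins, not the code in transformer_main; the sketch only shows that a lambda binds the same argument as `functools.partial` while being an ordinary function object:

```python
import functools
import inspect

def evaluate(params, dataset_name):
    """Hypothetical function; stands in for the callable used during eval."""
    return "%s:%d" % (dataset_name, params["batch_size"])

params = {"batch_size": 64}

# Before: a partial object, which is not a plain Python function.
bound_partial = functools.partial(evaluate, params)

# After: a lambda binding the same argument; this is an ordinary function.
bound_lambda = lambda dataset_name: evaluate(params, dataset_name)

partial_is_function = inspect.isfunction(bound_partial)
lambda_is_function = inspect.isfunction(bound_lambda)
```

Both callables return the same result for the same input; only the object type handed to the introspection code differs.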
-
- 26 Jul, 2018 1 commit
-
-
Jiang Yu authored
* fix batch_size in transformer_main.py: fix batch_size in transformer_main.py, which causes ResourceExhaustedError: OOM during training Transformer models using models/official/transformer * small format change: change format from one line to multiple ones in order to pass lint tests * remove trailing space and add comment
-
- 11 Jul, 2018 1 commit
-
-
cclauss authored
* Use six and feature detection in string conversion Leverage [__six.ensure_text()__](https://github.com/benjaminp/six/blob/master/six.py#L890) to deliver Unicode text in both Python 2 and Python 3. Follow Python porting best practice [use feature detection instead of version detection](https://docs.python.org/3/howto/pyporting.html#use-feature-detection-instead-of-version-detection) in ___unicode_to_native()__. * Revert the use of six.ensure_text() Thanks for catching that! I jumped the gun. It is I who have brought shame...
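Feature detection, as opposed to version detection, can be sketched as below. This is an illustrative guess at the pattern, not the committed code: the function name `unicode_to_native` and its body are assumptions, and it checks for the `unicode` builtin instead of inspecting the interpreter version.

```python
def unicode_to_native(s):
    """Hypothetical sketch: return a native str on both Python 2 and Python 3."""
    try:
        # Python 2: the `unicode` builtin exists, so encode unicode to a byte str.
        return s.encode("utf-8") if isinstance(s, unicode) else s  # noqa: F821
    except NameError:
        # Python 3: `unicode` is undefined and str is already the native text type.
        return s

native = unicode_to_native("héllo")
```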
-
- 26 Jun, 2018 1 commit
-
-
Billy Lamberta authored
-
- 22 Jun, 2018 1 commit
-
-
Katherine Wu authored
-
- 20 Jun, 2018 1 commit
-
-
Taylor Robie authored
* begin branch * finish download script * rename download to dataset * intermediate commit * intermediate commit * misc tweaks * intermediate commit * intermediate commit * intermediate commit * delint and update census test. * add movie tests * delint * fix py2 issue * address PR comments * intermediate commit * intermediate commit * intermediate commit * finish wide deep transition to vanilla movielens * delint * intermediate commit * intermediate commit * intermediate commit * intermediate commit * fix import * add default ncf csv construction * change default on download_if_missing * shard and vectorize example serialization * fix import * update ncf data unittests * delint * delint * more delinting * fix wide-deep movielens serialization * address PR comments * add file_io tests * investigate wide-deep test failure * remove hard coded path and properly...
-
- 18 Jun, 2018 1 commit
-
-
Taylor Robie authored
* remove unused imports and lint * fix schedule.py * address PR comments
-
- 12 Jun, 2018 2 commits
-
-
Katherine Wu authored
* Add checklist for official models. Remove file access from flag validator (causing issues with BUILD) * spelling * address PR comments
-
Katherine Wu authored
* Add DistributionStrategy to transformer model * add num_gpu flag * Calculate per device batch size for transformer * remove reference to flags_core * Add synthetic data option to transformer * fix typo * add import back in * Use hierarchical copy * address PR comments * lint * fix spaces * group train op together to fix single GPU error * Fix translate bug (sorted_keys is a dict, not a list) * Change params to a default dict (translate.py was throwing errors because params didn't have the TPU parameters.) * Address PR comments. Removed multi gpu flag + more * fix lint * fix more lints * add todo for Synthetic dataset * Update docs
-
- 07 Jun, 2018 1 commit
-
-
Katherine Wu authored
-
- 06 Jun, 2018 1 commit
-
-
Taylor Robie authored
* add tests for matmul embedding and schedule manager, as well as some minor cleanup * delint * address PR comments
-
- 05 Jun, 2018 1 commit
-
-
Katherine Wu authored
-
- 04 Jun, 2018 1 commit
-
-
Taylor Robie authored
* port changes from previous branch now that transformer util changes are in master; fix incorrect count; correct (hopefully) treatment of batch_size; set eval_metrics to a dummy function for now; add some comments; start bringing metrics to transformer TPU; resolve logits shape; metrics are now working except for tf.py_func metrics; increase batch_size for tpu, and create summary host call; fix host call; reduce tpu default batch size; further tune batch sizes; add minibatch loss to summary; handle case of single_iteration_train_steps > number points in an epoch; begin to incorporate hooks; add sleep workarounds; disable hooks altogether; generalize host call function and move to newly created tpu utils module; remove all traces of params as an object; switch from to; address some PR comments, and change the number of data points; minor tweaks; add tpu dry run for testing, and use matmul for TPU embedding; infeed/outfeed queue issue is fixed, sleeps are no longer necessary; add some documentation; cleanup and address PR comments; delint; add accelerator __init__; fix embedding; missed PR comment; address PR comments; fix validator bug; rewrite cloud storage validator, and add oauth dependency to requirements.txt * delint
-
- 01 Jun, 2018 2 commits
-
-
Qianli Scott Zhu authored
* Add new test ID and test env info to the benchmark run. * Fix test. * Fix lint * Address review comment.
-
Qianli Scott Zhu authored
* Update benchmark logger to update the run status. This is important for streaming upload to bigquery so that the dashboard can ignore the 'running' benchmark at the moment since it's not finished yet. * Move the run status into a separate table. Also update the run status in the benchmark uploader and BigqueryBenchmarkLogger. * Insert instead of update for the benchmark status for file logger. * Address review comments. Update the logger to have benchmark context, which will update the run status accordingly. * Fix broken tests. * Move the benchmark logger context to main function. * Fix tests. * Update the rest of the models to use the context in main. * Delint.
-
- 15 May, 2018 1 commit
-
-
Katherine Wu authored
-
- 11 May, 2018 2 commits
-
-
Qianli Scott Zhu authored
* Update the wide_deep code for latest benchmark config. * Also update the transformer benchmark code.
-
Katherine Wu authored
-
- 02 May, 2018 1 commit
-
-
Katherine Wu authored
-