- 02 Aug, 2018 1 commit
-
-
Reed authored
The data_async_generation.py process would print to stderr, but the main process would redirect that process's stderr to a pipe. The main process never read from the pipe, so when the pipe was full, data_async_generation.py would stall on a write to stderr. This change makes data_async_generation.py not write to stdout/stderr.
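The stall described above is the usual hazard of handing a child process a pipe that nobody drains. The commit fixes it by silencing the child. The sketch below shows a related parent-side precaution instead: sending the child's output to `subprocess.DEVNULL` so a write can never block. The child command and the function name `run_child_discarding_output` are hypothetical stand-ins, not code from the repository.

```python
import subprocess
import sys

# Hypothetical child: writes 1 MiB to stderr, more than a typical OS pipe
# buffer holds, so it would block if stderr were an unread pipe.
CHILD_CODE = "import sys; sys.stderr.write('x' * (1 << 20)); sys.stderr.flush()"


def run_child_discarding_output(timeout_s=30):
    """Run the child with stdout/stderr discarded and return its exit code."""
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD_CODE],
        stdout=subprocess.DEVNULL,  # discarded, so writes never block
        stderr=subprocess.DEVNULL,  # using subprocess.PIPE here without reading it risks a stall
    )
    return proc.wait(timeout=timeout_s)
```

If the parent actually needs the output, the alternative is to read the pipe continuously, for example with `proc.communicate()`.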
-
- 01 Aug, 2018 1 commit
-
-
Reed authored
The output of an embedding layer is already flattened, so the Flatten layers acted as no-ops.
-
- 31 Jul, 2018 8 commits
-
-
Taylor Robie authored
-
Reed authored
* Fix crash when Python interpreter not on PATH. * Fix lint error.
-
Reed authored
-
Reed authored
-
Taylor Robie authored
* add indirection file * remove unused imports * fix import
-
Reed authored
-
Reed authored
-
Reed authored
-
- 30 Jul, 2018 2 commits
-
-
Taylor Robie authored
* intermediate commit * ncf now working * reorder pipeline * allow batched decode for file backed dataset * fix bug * more tweaks * parallelize false negative generation * shared pool hack * workers ignore sigint * intermediate commit * simplify buffer backed dataset creation to fixed length record approach only. (more cleanup needed) * more tweaks * simplify pipeline * fix misplaced cleanup() calls. (validation works!) * more tweaks * sixify memoryview usage * more sixification * fix bug * add future imports * break up training input pipeline * more pipeline tuning * first pass at moving negative generation to async * refactor async pipeline to use files instead of ipc * refactor async pipeline * move expansion and concatenation from reduce worker to generation workers * abandon complete async due to interactions with the tensorflow threadpool * cleanup * remove performance_comparison.py * experiment with rough generator + interleave pipeline * yet more pipeline tuning * update on-the-fly pipeline * refactor preprocessing, and move train generation behind a GRPC server * fix leftover call * intermediate commit * intermediate commit * fix index error in data pipeline, and add logging to train data server * make sharding more robust to imbalance * correctly sample with replacement * file buffers are no longer needed for this branch * tweak sampling methods * add README for data pipeline * fix eval sampling, and vectorize eval metrics * add spillover and static training batch sizes * clean up cruft from earlier iterations * rough delint * delint 2 / n * add type annotations * update run script * make run.sh a bit nicer * change embedding initializer to match reference * rough pass at pure estimator model_fn * impose static shape hack (revisit later) * refinements * fix dir error in run.sh * add documentation * add more docs and fix an assert * old data test is no longer valid.
Keeping it around as reference for the new one * rough draft of data pipeline validation script * don't rely on shuffle default * tweaks and documentation * add separate eval batch size for performance * initial commit * terrible hacking * mini hacks * missed a bug * messing about trying to get TPU running * TFRecords based TPU attempt * bug fixes * don't log remotely * more bug fixes * TPU tweaks and bug fixes * more tweaks * more adjustments * rework model definition * tweak data pipeline * refactor async TFRecords generation * temp commit to run.sh * update log behavior * fix logging bug * add check for subprocess start to avoid cryptic hangs * unify deserialize and make it TPU compliant * delint * remove gRPC pipeline code * fix logging bug * delint and remove old test files * add unit tests for NCF pipeline * delint * clean up run.sh, and add run_tpu.sh * forgot the most important line * fix run.sh bugs * yet more bash debugging * small tweak to add keras summaries to model_fn * Clean up sixification issues * address PR comments * delinting is never over
-
Sundara Tejaswi Digumarti authored
Removed the conditional over distributed strategies when computing metrics. Metrics are now computed even when distributed strategies are used.
-
- 26 Jul, 2018 1 commit
-
-
Jiang Yu authored
* fix batch_size in transformer_main.py, which causes ResourceExhaustedError: OOM when training Transformer models using models/official/transformer * small format change: change format from one line to multiple lines in order to pass lint tests * remove trailing space and add comment
-
- 21 Jul, 2018 1 commit
-
-
Igor Ganichev authored
float32 should be fine for mnist loss and accuracy metrics and float64 is not available on TPUs.
-
- 20 Jul, 2018 1 commit
-
-
Yanhui Liang authored
* Add more arguments * Add eager mode * Add notes for eager mode * Address the comments * Fix argument typos * Add warning for eager and multi-gpu * Fix typo * Fix notes * Fix pylint
-
- 19 Jul, 2018 1 commit
-
-
Asim Shankar authored
-
- 13 Jul, 2018 2 commits
-
-
Yanhui Liang authored
* Add callbacks * Add readme * update readme * fix some comments * Address all comments * Update docstrings * Add method docstrings * Update callbacks * Add comments on global_step initialization * Some updates * Address comments
-
Qianli Scott Zhu authored
* Add shorter timeout for GCP util. * Add comment for change reason and unit for timeout.
-
- 12 Jul, 2018 1 commit
-
-
Taylor Robie authored
-
- 11 Jul, 2018 1 commit
-
-
cclauss authored
* Use six and feature detection in string conversion Leverage [__six.ensure_text()__](https://github.com/benjaminp/six/blob/master/six.py#L890) to deliver Unicode text in both Python 2 and Python 3. Follow Python porting best practice [use feature detection instead of version detection](https://docs.python.org/3/howto/pyporting.html#use-feature-detection-instead-of-version-detection) in ___unicode_to_native()__. * Revert the use of six.ensure_text() Thanks for catching that! I jumped the gun. It is I who have brought shame...
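As a rough illustration of the "feature detection instead of version detection" advice cited above, the sketch below probes for the Python 2-only `unicode` builtin rather than checking the interpreter version. The function name `to_native_str` is hypothetical and is not claimed to match the repository's helper.

```python
def to_native_str(s):
    """Return `s` as the interpreter's native `str` type (illustrative sketch)."""
    try:
        text_type = unicode  # feature probe: this builtin exists only on Python 2
    except NameError:
        # Python 3: `str` is already Unicode text, so nothing needs converting.
        return s
    # Python 2: native `str` is bytes, so encode Unicode text to UTF-8.
    if isinstance(s, text_type):
        return s.encode("utf-8")
    return s
```

Probing for the name keeps the code correct on any interpreter that provides (or lacks) `unicode`, without hard-coding a version comparison.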
-
- 10 Jul, 2018 1 commit
-
-
Eliel Hojman authored
* Rename of files in README.md The names of the files to run, as specified in README.md, were not updated. * Missed one file name change
-
- 26 Jun, 2018 1 commit
-
-
Billy Lamberta authored
-
- 25 Jun, 2018 2 commits
-
-
Qianli Scott Zhu authored
-
Qianli Scott Zhu authored
-
- 22 Jun, 2018 1 commit
-
-
Katherine Wu authored
-
- 20 Jun, 2018 1 commit
-
-
Taylor Robie authored
* begin branch * finish download script * rename download to dataset * intermediate commit * intermediate commit * misc tweaks * intermediate commit * intermediate commit * intermediate commit * delint and update census test. * add movie tests * delint * fix py2 issue * address PR comments * intermediate commit * intermediate commit * intermediate commit * finish wide deep transition to vanilla movielens * delint * intermediate commit * intermediate commit * intermediate commit * intermediate commit * fix import * add default ncf csv construction * change default on download_if_missing * shard and vectorize example serialization * fix import * update ncf data unittests * delint * delint * more delinting * fix wide-deep movielens serialization * address PR comments * add file_io tests * investigate wide-deep test failure * remove hard coded path and properly use flags. * address file_io test PR comments * missed a hash_bucket_size
-
- 18 Jun, 2018 1 commit
-
-
Taylor Robie authored
* remove unused imports and lint * fix schedule.py * address PR comments
-
- 12 Jun, 2018 2 commits
-
-
Katherine Wu authored
* Add checklist for official models. Remove file access from flag validator (causing issues with BUILD) * spelling * address PR comments
-
Katherine Wu authored
* Add DistributionStrategy to transformer model * add num_gpu flag * Calculate per device batch size for transformer * remove reference to flags_core * Add synthetic data option to transformer * fix typo * add import back in * Use hierarchical copy * address PR comments * lint * fix spaces * group train op together to fix single GPU error * Fix translate bug (sorted_keys is a dict, not a list) * Change params to a default dict (translate.py was throwing errors because params didn't have the TPU parameters.) * Address PR comments. Removed multi gpu flag + more * fix lint * fix more lints * add todo for Synthetic dataset * Update docs
-
- 11 Jun, 2018 1 commit
-
-
Taylor Robie authored
* fix comment * align comment
-
- 08 Jun, 2018 1 commit
-
-
Yanhui Liang authored
-
- 07 Jun, 2018 1 commit
-
-
Katherine Wu authored
-
- 06 Jun, 2018 3 commits
-
-
Qianli Scott Zhu authored
* Improve the BenchmarkFileLogger performance. Open the metric log file during init and keep the file handle open, which reduces the overhead of opening and closing the file on each log_metric call. * Address review comment.
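The idea of opening the log file once and reusing the handle can be sketched as follows. This is a minimal illustration, not the actual BenchmarkFileLogger: the class name `MetricFileLogger`, the JSON-lines record layout, and the explicit `close()` method are all assumptions made for the example.

```python
import json
import os
import tempfile


class MetricFileLogger:
    """Hypothetical logger that keeps one file handle open for its lifetime."""

    def __init__(self, path):
        # Opened once here instead of on every log_metric() call.
        self._file = open(path, "a")

    def log_metric(self, name, value):
        # One JSON object per line; the record layout is an assumption.
        self._file.write(json.dumps({"name": name, "value": value}) + "\n")

    def close(self):
        # Flushes buffered records and releases the handle.
        self._file.close()


def demo():
    """Write two metrics to a temporary file and read them back."""
    path = os.path.join(tempfile.mkdtemp(), "metric.log")
    logger = MetricFileLogger(path)
    logger.log_metric("accuracy", 0.5)
    logger.log_metric("loss", 1.25)
    logger.close()
    with open(path) as f:
        return [json.loads(line) for line in f]
```

The trade-off is that records may sit in the write buffer until the handle is flushed or closed, so a logger written this way needs a reliable shutdown path.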
-
Taylor Robie authored
* add .take() to dataset pipeline * delint * address PR comments
-
Taylor Robie authored
* add tests for matmul embedding and schedule manager, as well as some minor cleanup * delint * address PR comments
-
- 05 Jun, 2018 2 commits
-
-
Katherine Wu authored
-
Qianli Scott Zhu authored
This prevents the unit test from reading the local config of the GCP API, which does not necessarily exist in the test environment.
-
- 04 Jun, 2018 3 commits
-
-
Qianli Scott Zhu authored
-
Taylor Robie authored
-
Taylor Robie authored
-